said recombination host cell, and screening for positive polypeptide variants.

Despite the existence of the above methods there is still a need for
even better iterative in vivo recombination methods for preparing
novel positive polypeptide variants.



SUMMARY OF THE INVENTION 

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an in vivo
recombination method.

The inventor of the present invention has surprisingly found that
such positive polypeptide variants may advantageously be prepared by
shuffling different nucleotide sequences of homologous DNA sequences
by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

BRIEF DESCRIPTION OF THE DRAWINGS



Figure 1 shows the yeast expression plasmid pJS026 comprising a DNA
sequence encoding the Humicola lanuginosa lipase gene.
Figure 2 shows the yeast expression plasmid pJS037, comprising a DNA
sequence encoding the Humicola lanuginosa lipase gene containing
twelve additional restriction sites.
Figure 3 shows the plasmid pJS026.
Figure 4 shows the plasmid pJS037.

Figure 5 shows the in vivo recombination of the 0.9 kb synthetic
wild-type Humicola lanuginosa lipase with pJS037 using Saccharomyces
cerevisiae as the recombination host cell (described in Example 1).
Figure 6 shows the in vivo recombination of a DNA fragment prepared
from Humicola lanuginosa lipase variant (y) with Humicola lanuginosa
lipase variant (d) comprised in a plasmid using Saccharomyces
cerevisiae as the recombination host cell (described in Example 2).
Figure 7 shows an overview of the location of the inactivation
site of the Humicola lanuginosa lipase gene and the number of the
clone (referred to as "blue number" in the tables). Locations of
restriction enzyme sites and clone numbers are relative to the
initiation codon of the lipase gene. In all cases a stop codon was
located in the new reading frame 10 to 50 bp from the frameshift.
Figure 8 shows an overview of the creation of active Humicola
lanuginosa lipase genes from the recombinations in Table 2A and B
by a "mosaic mechanism". Lines indicate the introduction of
fragment sequence into the vector and lines with an X indicate
sequences that are not introduced in the active lipase colonies.
The primers used for the PCR fragment are shown together with the
location of the frameshift mutation (marked by the restriction site
used for the construction).
Figure 9 shows an overview of fragments used in the recombination
of 2 partially overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown together with the
location of the frameshift mutation (if not wild type).
Figure 10 shows an overview of fragments used in the recombination
of 3 partially overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown. The overlap between
PCR353 and 355 is only 10 bp.

DETAILED DESCRIPTION OF THE INVENTION

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an iterative in vivo
recombination method.

The inventor of the present invention has surprisingly found an
efficient method for shuffling homologous DNA sequences in an in
vivo recombination system using a eukaryotic cell as a recombination
host cell.

A "recombination host cell" is in the context of the present
invention a cell capable of mediating shuffling of a number of
homologous DNA sequences.

The term "shuffling" means recombination of nucleotide sequence(s)
between two or more homologous DNA sequences resulting in output DNA
sequences (i.e. DNA sequences having been subjected to a shuffling
cycle) having a number of nucleotides exchanged, in comparison to
the input DNA sequences (i.e. starting point homologous DNA
sequences).
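
As a purely illustrative toy model of this definition (not of the in vivo mechanism itself), the following Python sketch builds an output sequence from two aligned input sequences of equal length by switching template at chosen crossover positions, and then lists the positions at which nucleotides were exchanged relative to the first input. The function names, the example sequences and the crossover positions are assumptions made for illustration only.

```python
def shuffle_sequences(seq_a, seq_b, crossovers):
    """Build a mosaic of two aligned, equal-length input sequences.

    Copying starts from seq_a and switches to the other template at
    every 0-based position listed in `crossovers`.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("input sequences must be aligned and of equal length")
    templates = (seq_a, seq_b)
    switch_points = set(crossovers)
    current = 0
    output = []
    for position in range(len(seq_a)):
        if position in switch_points:
            current = 1 - current  # switch template at a crossover
        output.append(templates[current][position])
    return "".join(output)


def exchanged_positions(input_seq, output_seq):
    """Return the positions where the output differs from the input."""
    return [i for i, (x, y) in enumerate(zip(input_seq, output_seq)) if x != y]


# Hypothetical input DNA sequences differing at positions 5, 11 and 14.
input_a = "ATGGCTAGCAAGCTT"
input_b = "ATGGCAAGCAAACTA"

output_seq = shuffle_sequences(input_a, input_b, crossovers=[4, 8, 13])
print(output_seq)                                # ATGGCAAGCAAGCTA
print(exchanged_positions(input_a, output_seq))  # [5, 14]
```

With three crossovers the output carries two separate regions from the second input, which is the kind of mosaic with more than one region of nucleotides exchanged that the description refers to.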

An important advantage of the invention is that mosaic DNA sequences
with multiple replacement points or replacements, not related to the
opening site, are created, which is not discovered in Pompon's
method.

Another important advantage of the present invention is that when
using a mixture of fragments and opened vectors in the screening
set-up, it gives the possibility of many different clones to
recombine pairwise or even triplewise (as can be seen in a couple of
examples below).



The in vivo recombination method of the invention is simple to perform
and results in a high level of mixing of homologous genes or
variants. A large number of variants or homologous genes can be
mixed in one transformation. The mixing of improved variants or wild
type genes followed by screening increases the number of further
improved variants manyfold compared to doing only random
mutagenesis.

Recombination of multiple overlapping fragments is possible with a
high efficiency, increasing the mixing of variants or homologous
genes using the in vivo recombination method. An overlap as small as
10 bp is sufficient for recombination, which may be utilized for very
easy domain shuffling of even distantly related genes.

The invention relates to a method for preparing polypeptide variants
by shuffling different nucleotide sequences of homologous DNA
sequences by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

According to the invention more than one cycle of step a) to f) may
be performed.

The opening of the plasmid(s) in step b) can be directed toward any
site within the polypeptide coding region of the plasmid. The
plasmid(s) may be opened by any suitable methods known in the art.
The opened ends of the plasmid may be filled in with nucleotides as
described in Pompon et al., (1989), supra. It is preferred not to
fill in the opened ends as it might create a frameshift.

It is preferred to open the plasmid(s) around the middle of the
polypeptide coding region(s), as this is believed to result in
a more effective recombination between DNA fragment(s) and opened
plasmid(s).

In an embodiment of the invention the DNA fragment(s) is (are)
prepared under conditions resulting in a low, medium or high random
mutagenesis frequency.

To obtain low mutagenesis frequency the DNA sequence(s) (comprising
the DNA fragment(s)) may be prepared by a standard PCR amplification
method (US 4,683,202 or Saiki et al., (1988), Science 239, 487 -
491).

A medium or high mutagenesis frequency may be obtained by performing
the PCR amplification under conditions which increase the
misincorporation of nucleotides, for instance as described by Deshler,
(1992), GATA 9(4), 103-106; Leung et al., (1989), Technique, Vol. 1,
No. 1, 11-15.

It is also contemplated according to the invention to combine the
PCR amplification (i.e. according to this embodiment also DNA
fragment mutation) with a mutagenesis step using a suitable physical
or chemical mutagenizing agent, e.g., one which induces transitions,
transversions, inversions, scrambling, deletions, and/or insertions.

In the context of the present invention the term "positive
polypeptide variants" means resulting polypeptide variants possessing
functional properties which have been improved in comparison to the
polypeptides producible from the corresponding input DNA sequences.
Examples of such improved properties can be as different as e.g.
biological activity, enzyme washing performance, antibiotic
resistance etc.

Consequently, which screening method is to be used for identifying
positive variants depends on the desired improved property of the
polypeptide variant in question.

If, for instance, the polypeptide in question is an enzyme and the
desired improved functional property is the wash performance, the
screening in step f) may conveniently be performed by use of a
filter assay based on the following principle:

The recombination host cell is incubated on a suitable medium and
under suitable conditions for the enzyme to be secreted, the medium
being provided with a double filter comprising a first
protein-binding filter and on top of that a second filter exhibiting
a low protein binding capability. The recombination host cell is
located on the second filter. Subsequent to the incubation, the first
filter comprising the enzyme secreted from the recombination host
cell is separated from the second filter comprising said cells. The
first filter is subjected to screening for the desired enzymatic
activity and the corresponding microbial colonies present on the
second filter are identified.

The filter used for binding the enzymatic activity may be any
protein binding filter e.g. nylon or nitrocellulose. The top filter
carrying the colonies of the expression organism may be any filter
that has no or low affinity for binding proteins e.g. cellulose
acetate or Durapore®. The filter may be pre-treated with any of the
conditions to be used for screening or may be treated during the
detection of enzymatic activity.

The enzymatic activity may be detected by a dye, fluorescence,
precipitation, pH indicator, IR-absorbance or any other known
technique for detection of enzymatic activity.

The detecting compound may be immobilized by any immobilizing agent 
e.g. agarose, agar, gelatine, polyacrylamide, starch, filter paper, 
cloth; or any combination of immobilizing agents. 

If the improved functional property of the polypeptide is not
sufficiently good after one cycle of shuffling, the polypeptide may
be subjected to another cycle.

In an embodiment of the invention at least one shuffling cycle is a
backcrossing cycle with the initially used DNA fragment, which may
be the wild-type DNA fragment. This eliminates non-essential
mutations. Non-essential mutations may also be eliminated by using
wild-type DNA fragments as the initially used input DNA material.

It is to be understood that the method of the invention is suitable
for all types of polypeptide, including enzymes such as proteases,
amylases, lipases, cutinases, cellulases, peroxidases and
oxidases.

Also contemplated according to the invention are polypeptides having
biological activity such as insulin, ACTH, glucagon, somatostatin,
somatotropin, thymosin, parathyroid hormone, pigmentary hormones,
somatomedin, erythropoietin, luteinizing hormone, chorionic
gonadotropin, hypothalamic releasing factors, antidiuretic hormones,
thyroid stimulating hormone, relaxin, interferon, thrombopoietin
(TPO) and prolactin.

Especially contemplated according to the present invention is
initially to use input DNA sequences being either wild-type, variant
or modified DNA sequences, such as DNA sequences coding for
wild-type, variant or modified enzymes, respectively, in particular
enzymes exhibiting lipolytic activity.

In an embodiment of the invention the lipolytic activity is a lipase
activity derived from the filamentous fungi of the Humicola sp., in
particular Humicola lanuginosa.

In a specific embodiment of the invention the initially used input
DNA fragment to be shuffled with a homologous polypeptide is the
wild-type DNA sequence encoding the Humicola lanuginosa lipase
derived from Humicola lanuginosa DSM 4109 described in EP 305 216
(Novo Nordisk A/S).

Also specifically encompassed by the scope of the invention are input
DNA sequences selected from the group of vectors (a) to (f) and/or
DNA fragments (g) to (aa) coding for Humicola lanuginosa lipase
variants from the list below in the Material and Methods section.

Throughout the present application the name Humicola lanuginosa has
been used to identify one preferred parent enzyme, i.e. the one
mentioned immediately above. However, in recent years H. lanuginosa
has also been termed Thermomyces lanuginosus (a species introduced
the first time by Tsiklinsky) since the fungus shows
morphological and physiological similarity to Thermomyces
lanuginosus. Accordingly, it will be understood that whenever
reference is made to H. lanuginosa this term could be replaced by
Thermomyces lanuginosus. The DNA encoding part of the 18S ribosomal
gene from Thermomyces lanuginosus (or H. lanuginosa) has been
sequenced. The resulting 18S sequence was compared to other 18S
sequences in the GenBank database and a phylogenetic analysis using
parsimony (PAUP, Version 3.1.1, Smithsonian Institution, 1993) has
also been made. This clearly assigns Thermomyces lanuginosus to the
class of Plectomycetes, probably to the order of Eurotiales.
According to the Entrez Browser at the NCBI (National Center for
Biotechnology Information), this relates Thermomyces lanuginosus to
families like Eremascaceae, Monascaceae, Pseudoeurotiaceae and
Trichocomaceae, the latter containing genera like Emericella,
Aspergillus, Penicillium, Eupenicillium, Paecilomyces, Talaromyces,
Thermoascus and Sclerocleista.

Consequently, such genes encoding lipolytic enzymes of filamentous
fungi of the genera Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista are also specifically contemplated according to the
present invention.

Other examples of relevant filamentous fungi genes encoding
lipolytic enzymes include strains of the Absidia sp. e.g. the
strains listed in WO 96/13578 (from Novo Nordisk A/S) which is
hereby incorporated by reference. Absidia sp. strains listed in WO
96/13578 include Absidia blakesleeana, Absidia corymbifera and
Absidia reflexa.

Strains of Rhizopus sp., in particular Rh. niveus and Rh. oryzae are
also contemplated according to the invention.

The lipolytic gene may also be derived from a bacterium, such as a
strain of the Pseudomonas sp., in particular Ps. fragi, Ps.
stutzeri, Ps. cepacia and Ps. fluorescens (WO 89/04361), or Ps.
plantarii or Ps. gladioli (US 4,950,417) or Ps. alcaligenes and Ps.
pseudoalcaligenes (EP 218 272, EP 331 376, or WO 94/25578
(disclosing variants of the Ps. pseudoalcaligenes lipolytic enzyme)),
the Pseudomonas sp. variants disclosed in EP 407 225, or a
Pseudomonas sp. lipolytic enzyme, such as the Ps. mendocina (also
termed Ps. putida) lipolytic enzyme described in WO 88/09367 and US
5,389,536 or variants thereof as described in US 5,352,594, or Ps.
aeruginosa or Ps. glumae, or Ps. syringae, or Ps. wisconsinensis (WO
96/12012 from Solvay), or a strain of Bacillus sp., e.g. the B.
subtilis described by Dartois et al., (1993) Biochemica et
Biophysica acta 1131, 253-260, or B. stearothermophilus (JP
64/7744992) or B. pumilus (WO 91/16422) or a strain of Streptomyces
sp., e.g. S. scabies, or a strain of Chromobacterium sp. e.g. C.
viscosum.

In connection with the Pseudomonas sp. lipases it has been found
that lipases from the following organisms have a high degree of
homology, such as at least 60% homology, at least 80% homology or at
least 90% homology, and thus are contemplated to belong to the same
family of lipases: Ps. ATCC 21808, Pseudomonas sp. lipase
commercially available as Liposam®, Ps. aeruginosa EF2, Ps.
aeruginosa PAC1R, Ps. aeruginosa PAO1, Ps. aeruginosa TE 3285, Ps.
sp. 109, Ps. pseudoalcaligenes M1, Ps. glumae, Ps. cepacia DSM 3959,
Ps. cepacia M-12-33, Ps. sp. KWI-56, Ps. putida IFO 3458, Ps. putida
IFO 12049 (Gilbert, E. J., (1993), Pseudomonas lipases: Biochemical
properties and molecular cloning. Enzyme Microb. Technol., 15,
634-645). The species Pseudomonas cepacia has recently been
reclassified as Burkholderia cepacia, but is termed Ps. cepacia in
the present application.

Also genes encoding lipolytic enzymes from yeasts are relevant, and
include lipolytic genes from Candida sp., in particular Candida
rugosa, or Geotrichum sp., in particular Geotrichum candidum.

Specific examples of microorganisms comprising genes encoding
lipolytic enzymes used for commercially available products and which
may serve as donor of genes to be shuffled according to the
invention include Humicola lanuginosa, used in Lipolase®, Lipolase®
Ultra, Ps. mendocina used in Lumafast®, Ps. alcaligenes used in
Lipomax®, Fusarium solani, Bacillus sp. (US 5427936, EP 528828),
Ps. mendocina, used in Liposam®.

Also the Pseudomonas sp. lipase gene shown in SEQ ID NO 14 is
specifically contemplated according to the invention.

It is to be emphasized that genes encoding lipolytic enzymes to be
shuffled according to the invention may be any of the above mentioned
genes of lipolytic enzymes and any variant, modification, or
truncation thereof. Examples of such genes which are specifically
contemplated include the genes encoding the enzymes described in WO
92/05249, WO 94/01541, WO 94/14951, WO 94/25577, WO 95/22615 and
protein engineered lipase variants as described in EP 407 225; a
protein engineered Ps. mendocina lipase as described in US 5,352,594;
a cutinase variant as described in WO 94/14964; a variant of an
Aspergillus lipolytic enzyme as described in EP patent 167,309; and
Pseudomonas sp. lipase described in WO 95/06720.

A requirement for the DNA sequences, encoding the polypeptide(s), to
be shuffled, is that they are at least 60%, preferably at least 70%,
better more than 80%, especially more than 90%, and even better up
to almost 100% homologous. DNA sequences being less homologous will
have less inclination to interact and recombine.
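
The thresholds above can be checked for a candidate pair of input sequences with a short calculation. The Python sketch below is a simplified illustration that assumes "homology" is approximated by percent identity over an alignment prepared beforehand (gaps written as "-"); the text itself does not prescribe how homology is computed, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions over aligned columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def homology_tier(pct):
    """Map a percentage onto the tiers named in the description."""
    if pct > 90:
        return "especially preferred (more than 90%)"
    if pct > 80:
        return "better (more than 80%)"
    if pct >= 70:
        return "preferred (at least 70%)"
    if pct >= 60:
        return "acceptable (at least 60%)"
    return "below the 60% requirement"


# Hypothetical aligned sequences differing at 2 of 12 positions.
pct = percent_identity("ATGGCTAGCAAG", "ATGGCAAGCTAG")
print(round(pct, 1), homology_tier(pct))
```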

It is also contemplated according to the invention to shuffle parent
(homologous) wild-type organisms of different genera.

Further, the DNA fragment(s) to be shuffled may preferably have a
length of from about 20 bp to 8 kb, preferably about 40 bp to 6 kb,
more preferred about 80 bp to 4 kb, especially about 100 bp to 2 kb,
to be able to interact optimally with the opened plasmid.
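
The nested length ranges can be expressed as a small lookup. This Python sketch simply reports the narrowest stated range that a fragment length falls into; the labels and the function name are illustrative assumptions, and the bounds are treated as exact although the text gives them as approximate ("about").

```python
# (label, lower bound in bp, upper bound in bp), narrowest range first.
LENGTH_TIERS = [
    ("especially preferred (100 bp to 2 kb)", 100, 2000),
    ("more preferred (80 bp to 4 kb)", 80, 4000),
    ("preferred (40 bp to 6 kb)", 40, 6000),
    ("suitable (20 bp to 8 kb)", 20, 8000),
]


def length_tier(length_bp):
    """Return the narrowest stated range containing the fragment length."""
    for label, low, high in LENGTH_TIERS:
        if low <= length_bp <= high:
            return label
    return "outside the stated ranges"


# A 0.9 kb fragment, the size of the synthetic lipase gene of Figure 5.
print(length_tier(900))
```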

The method of the invention is very efficient for preparing
polypeptide variants in comparison to prior art methods comprising
transforming linear DNA fragments/sequences.

The inventor found that the transformation frequency of a mixture of
opened plasmid and a DNA fragment was significantly higher than
when transforming a plasmid cut at the same site alone. The
transformation frequency of the opened plasmid and DNA fragment was as
high as for uncut plasmid.

Without being limited to any theory it is believed that the opening
of the plasmid(s) restricts the replication of (opened) plasmid(s)
when not interacting with at least one DNA fragment. In accordance
with this an increased number of recombined DNA sequences was found
after only one shuffling cycle.

As described in Example 1, 50% of the resulting transformants
contained recombined DNA sequences of both input DNA sequences. As
high as 20% of the total number of recombined DNA sequences were
"random" mixtures (i.e. having more than one region of nucleotides
exchanged).

The input DNA sequences may be any DNA sequences including wild-type
DNA sequences, DNA sequences encoding variants or mutants, or
modifications thereof, such as extended or elongated DNA sequences,
and may also be the outcome of DNA sequences having been subjected
to one or more cycles of shuffling (i.e. output DNA sequences)
according to the method of the invention or any other method (e.g.
any of the methods described in the prior art section).

When using the method of the invention the output DNA sequences
(i.e. shuffled DNA sequences) have had a number of nucleotide(s)
exchanged. This results in replacement of at least one amino acid
within the polypeptide variant, if comparing it with the parent
polypeptide. It is to be understood that also silent mutations are
contemplated (i.e. nucleotide exchange which does not result in
changes in the amino acid sequence).

However, the method of the present invention will in most cases lead
to the replacement of a considerable number of amino acids and may in
certain cases even alter the structure of one or more polypeptide
domains (i.e. a folded unit of polypeptide structure).

According to the present invention more than two DNA sequences can be
shuffled at the same time. Actually any number of different DNA
fragments and homologous polypeptides comprised in suitable plasmids
may be shuffled at the same time. This is advantageous as a vast
number of quite different variants can be made rapidly without an
abundance of iterative procedures.

The inventor has tested the nucleotide shuffling method of the
invention using significantly more than two homologous DNA
sequences. As described in Example 2 it was surprisingly found that
the method of the invention advantageously can be used for
recombining more than two DNA sequences.

One cycle of shuffling according to the method of the invention may
result in the exchange of from 1 to 1000 nucleotides into the opened
plasmid DNA sequence encoding the polypeptide in question. The
exchanged nucleotide sequence(s) may be continuous or may be present
as a number of sub-sequences within the full-length sequence(s).

To support the present invention the inventor made a number of
additional experiments on different aspects of the method of the
invention. The experiments are described below and illustrated in
Examples 2 to 6 below.

A number of vectors and fragments comprising inactivated
synthetic Humicola lanuginosa lipase genes were constructed by
introducing frameshift/stop codon mutations in the lipase gene at
various positions. These were used for monitoring the in vivo
recombination of different combinations of opened vector(s) and DNA
fragments. The number of active lipase colonies was scored as
described in Example 3. The number of colonies determines the
efficiency of the opened vector(s) and fragment(s) recombination.

One frameshift mutation in said Humicola lanuginosa lipase gene in
the opened vector and another in the fragment on the opposite side
of the opening site gave 3 to 32% of active lipase colonies
depending on the location and combination. It was concluded that
the closer the mutation is to the ends of the vector, the
higher the mixing.

One frameshift mutation in the opened vector and two in the
fragment on each side of the opening site gave 4 to 42% of active
colonies depending on the location and combination. Some of these
active colonies can be considered to be mosaics, not only related
to the opening site.

Two frameshift mutations in the opened vector on each side of the
opening site and one in the fragment gave 0.5 to 2.1% of active
colonies depending on the location and combination. Most of these
active colonies are mosaics of the "parent" DNA.

Two frameshift mutations in the opened vector on each side of the
opening site and a wild type fragment gave 7.7 to 10.7% of active
colonies depending on the location.



It was also found that the amount of vectors relative to fragments
and the size of the fragments also influence the result.

Use of S. cerevisiae rad52 mutants as the recombination host
cell showed that the rad52 mutant transformed very well with wild
type plasmid(s) and expressed the Humicola lanuginosa lipase gene,
but gave no transformants at all with the opened vectors and
fragments.

The RAD52 function is required for "classical recombination" (but
not for unequal sister-strand mitotic recombination), showing that
the recombination of opened vector and fragment could involve a
classical recombination mechanism.

Classical recombination is the recombination mechanism involved in
the recombination between genes located on nonsister chromatids of
homologous chromosomes as defined in for example Petes TD, Malone
RE and Symington LS (1991) "Recombination in Yeast", page 407-522,
in The Molecular and Cellular Biology of the Yeast Saccharomyces,
Volume 1 (eds. Broach JR, Pringle JR and Jones EW), Cold Spring
Harbor Laboratory Press, New York.

Multiple partially overlapping fragments

The inventor also tested recombination of multiple partially
overlapping fragments using the method of the invention.

The recombination of 2 and 3 partially overlapping fragments into a
gapped vector (i.e. the opening results in cutting out a small
part of the gene) was tested and gave a high recovery of
recombined Humicola lanuginosa lipase gene. The recovery of active
lipase gene from different combinations of inactivated Humicola
lanuginosa genes was tested for the recombination of 2 partially
overlapping fragments. The tendency was a higher mixing in the
overlapping region between the 2 fragments in the gapped region
than in the vector and fragment overlap.

When recombining many fragments from the same region, the multiple
overlapping fragment technique will increase the mixing by itself,
but it is also important to have a relatively high random mixing in
overlapping regions in order to mix closely located
variants/differences.



An overlap as small as 10 bp between two fragments was found to be
sufficient to obtain a very efficient recombination. Therefore,
overlapping in the range from 5 to 5000 bp, preferably from 10 bp to
500 bp, especially 10 bp to 100 bp is suitable according to the
method of the invention.
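
When designing input fragments it can help to measure the overlap between two neighbouring fragments and compare it with the 10 bp to 500 bp and 10 bp to 100 bp ranges named above. The Python sketch below is a simplified illustration that assumes the overlap is an exact match between the end of the upstream fragment and the start of the downstream fragment; in practice homologous (non-identical) overlaps also recombine. The function names and sequences are hypothetical.

```python
def overlap_length(upstream, downstream):
    """Length of the longest suffix of `upstream` that exactly equals
    a prefix of `downstream` (0 if there is no such overlap)."""
    longest = min(len(upstream), len(downstream))
    for size in range(longest, 0, -1):
        if upstream[-size:] == downstream[:size]:
            return size
    return 0


def overlap_rating(overlap_bp):
    """Rate an overlap against the 10-500 bp and 10-100 bp ranges."""
    if 10 <= overlap_bp <= 100:
        return "especially suitable (10 to 100 bp)"
    if 10 <= overlap_bp <= 500:
        return "preferred (10 to 500 bp)"
    return "outside the preferred range"


# Hypothetical fragments sharing an 11 bp end.
fragment_1 = "ATGGCTAGCAAGCTTGGATCC"
fragment_2 = "AGCTTGGATCCGTTAACGGTACC"

shared = overlap_length(fragment_1, fragment_2)
print(shared, overlap_rating(shared))
```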



According to this embodiment of the present invention 2 or more
overlapping fragments, preferably 2 to 5 overlapping fragments,
especially 2 to 4 overlapping fragments may advantageously be used
as input fragments in a shuffling cycle.

Besides increasing the mixing of genes, this is a very useful
method for domain shuffling by creating small overlaps between DNA
fragments from different domains and screening for the best
combination.

For instance, in the case of three DNA fragments the overlapping 
regions may be as follows: 

- the first end of the first fragment overlaps the first end of the
opened plasmid,

- the first end of the second fragment overlaps the second end of
the first fragment, and the second end of the second fragment
overlaps the first end of the third fragment,

- the first end of the third fragment overlaps (as stated above) the
second end of the second fragment, and the second end of the third
fragment overlaps the second end of the opened plasmid.

It is to be understood that when using two or more DNA fragments as
starting material it is preferred to have continuous overlaps between
the ends of the plasmid and the DNA fragments.
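
The requirement for continuous overlaps can be checked from fragment coordinates before an experiment is set up. The Python sketch below is a minimal illustration under stated assumptions: fragments are given as half-open (start, end) positions on the coding sequence in 5' to 3' order, the gapped vector retains sequence before `gap_start` and from `gap_end` onwards, and a 10 bp minimum overlap is used as the default. All names and coordinates are hypothetical.

```python
def continuous_overlaps(gap_start, gap_end, fragments, min_overlap=10):
    """Check that the fragments tile the gap with overlaps at every joint.

    The first fragment must overlap the vector sequence before
    `gap_start`, each fragment must overlap the next one, and the last
    fragment must overlap the vector sequence from `gap_end` onwards.
    """
    if not fragments:
        return False
    ordered = sorted(fragments)

    # Overlap with the first end of the opened plasmid.
    first_start, first_end = ordered[0]
    if min(first_end, gap_start) - first_start < min_overlap:
        return False

    # Overlaps between consecutive fragments.
    for (_, end_1), (start_2, _) in zip(ordered, ordered[1:]):
        if end_1 - start_2 < min_overlap:
            return False

    # Overlap with the second end of the opened plasmid.
    last_start, last_end = ordered[-1]
    if last_end - max(last_start, gap_end) < min_overlap:
        return False
    return True


# Three hypothetical fragments on a 900 bp gene gapped from 100 to 800.
tiled = continuous_overlaps(100, 800, [(0, 320), (310, 620), (610, 900)])
broken = continuous_overlaps(100, 800, [(0, 320), (325, 620), (610, 900)])
print(tiled, broken)  # True False
```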

Even though it is preferred to shuffle homologous DNA sequences in
the form of DNA fragment(s) and opened plasmid(s), it is also
contemplated according to the invention to shuffle two or more
opened plasmids comprising homologous DNA sequences encoding
polypeptides. However, in such case it is compulsory to open the
plasmids at different sites.

In a further embodiment of the invention two or more opened
plasmids and one or more homologous DNA fragments are used as the
starting material to be shuffled. The ratio between the opened
plasmid(s) and homologous DNA fragment(s) preferably lies in the
range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol vector:mol
fragments) with the specific concentrations being from 1 pM to 10 M
of the DNA.
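
Because the ratio is stated in moles rather than mass, DNA amounts measured in nanograms have to be converted using the length of each molecule. The Python sketch below illustrates the arithmetic, assuming an average mass of 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value from this document) and reading the broad range as 20:1 to 1:50. The vector size and DNA amounts are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA; assumed approximation


def dsdna_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def vector_to_fragment_ratio(vector_ng, vector_bp, fragment_ng, fragment_bp):
    """Molar ratio of opened vector to fragment (mol vector : mol fragment)."""
    return dsdna_pmol(vector_ng, vector_bp) / dsdna_pmol(fragment_ng, fragment_bp)


def ratio_rating(ratio):
    """Rate a molar ratio against the stated ranges."""
    if 1 / 10 <= ratio <= 2:
        return "preferred (2:1 to 1:10)"
    if 1 / 50 <= ratio <= 20:
        return "within the broad range (20:1 to 1:50)"
    return "outside the stated ranges"


# Hypothetical mix: 100 ng of a 6000 bp opened vector and
# 75 ng of a 900 bp fragment.
ratio = vector_to_fragment_ratio(100, 6000, 75, 900)
print(round(ratio, 3), ratio_rating(ratio))  # 0.2, i.e. about 1:5
```

Equal masses of vector and fragment are far from equimolar when their lengths differ, which is why the conversion is worth doing explicitly.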

The opened plasmids may advantageously be gapped in such a way that
the overlap between the fragments is deleted in the vector (in order
to select for the recombination).

Preparing the DNA fragment

The DNA fragment to be shuffled with the homologous polypeptide
comprised in an opened plasmid may be prepared by any suitable
method. For instance, the DNA fragment may be prepared by PCR
amplification (polymerase chain reaction), as described above, of a
plasmid or vector comprising the gene of the polypeptide, using
specific primers, for instance as described in US 4,683,202 or Saiki
et al., (1988), Science 239, 487 - 491. The DNA fragment may also be
cut out from a vector or plasmid comprising the desired DNA sequence
by digestion with restriction enzymes, followed by isolation using
e.g. electrophoresis.

The DNA fragment encoding the homologous polypeptide in question may
alternatively be prepared synthetically by established standard
methods, e.g. the phosphoamidite method described by Beaucage and
Caruthers, (1981), Tetrahedron Letters 22, 1859 - 1869, or the
method described by Matthes et al., (1984), EMBO Journal 3, 801 -
805. According to the phosphoamidite method, oligonucleotides are
synthesized, e.g. in an automatic DNA synthesizer, purified,
annealed, ligated and cloned in suitable vectors.

Furthermore, the DNA fragment may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA origin prepared
by ligating fragments of synthetic, genomic or cDNA origin (as
appropriate), the fragments corresponding to various parts of the
entire DNA sequence, in accordance with standard techniques.

The plasmid 

The plasmid comprising the DNA sequence encoding the polypeptide in 
30 question may be prepared by Heating said DNA sequence into a 
suitable vector cr plasmid, or by any other suitable method. 

Said vector may be any vector which may conveniently be subjected to
recombinant DNA procedures. The choice of vector will often depend
on the recombination host cell into which it is to be introduced.

Thus, the vector may be an autonomously replicating vector, i.e. a
vector which exists as an extrachromosomal entity, the replication
of which is independent of chromosomal replication, e.g. a plasmid.
Alternatively, the vector may be one which, when introduced into the
recombination host cell, is integrated into the host cell genome and
replicated together with the chromosome(s) into which it has been
integrated.

To facilitate the screening process it is preferred that the vector
is an expression vector in which the DNA sequence encoding the
polypeptide in question is operably linked to additional segments
required for transcription of the DNA. In general, the expression
vector is derived from a plasmid, a cosmid or a bacteriophage, or
may contain elements of any or all of these.

The term "operably linked" indicates that the segments are arranged
so that they function in concert for their intended purposes, e.g.
transcription initiates in a promoter and proceeds through the DNA
sequence coding for the polypeptide in question.

The promoter may be any DNA sequence which shows transcriptional
activity in the recombination host cell of choice and may be derived
from genes encoding proteins, such as enzymes, either homologous or
heterologous to the host cell.

Examples of suitable promoters for use in yeast host cells include
promoters from yeast glycolytic genes (Hitzeman et al., (1980), J.
Biol. Chem. 255, 12073 - 12080; Alber and Kawasaki, (1982), J. Mol.
Appl. Gen. 1, 419 - 434) or alcohol dehydrogenase genes (Young et
al., in Genetic Engineering of Microorganisms for Chemicals
(Hollaender et al, eds.), Plenum Press, New York, 1982), or the TPI1
(US 4,599,311) or ADH2-4c (Russell et al., (1983), Nature 304, 652 -
654) promoters.

Examples of suitable promoters for use in filamentous fungus host
cells are, for instance, the ADH3 promoter (McKnight et al., (1985),
The EMBO J. 4, 2093 - 2099) or the tpiA promoter. Examples of other
useful promoters are those derived from the gene encoding A. oryzae
TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger
neutral α-amylase, A. niger acid stable α-amylase, A. niger or A.
awamori glucoamylase (gluA), Rhizomucor miehei lipase, A. oryzae
alkaline protease, A. oryzae triose phosphate isomerase or A.
nidulans acetamidase. Preferred are the TAKA-amylase and gluA
promoters.
The DNA sequence encoding the polypeptide in question may
also, if necessary, be operably connected to a suitable terminator,
such as the human growth hormone terminator (Palmiter et al., op.
cit.) or (for fungal hosts) the TPI1 (Alber and Kawasaki, op. cit.)
or ADH3 (McKnight et al., op. cit.) terminators. The vector may
further comprise elements such as polyadenylation signals (e.g. from
SV40 or the adenovirus 5 Elb region), transcriptional enhancer
sequences (e.g. the SV40 enhancer) and translational enhancer
sequences (e.g. the ones encoding adenovirus VA RNAs).

The vector may further comprise a DNA sequence enabling the vector
to replicate in the recombination host cell in question.
When the host cell is a yeast cell, suitable sequences enabling the
vector to replicate are the yeast plasmid 2μ replication genes REP
1-3 and origin of replication.

The plasmid p:l can be used for production of useful proteins and
peptides, using filamentous fungi, such as Aspergillus sp., and
yeasts as recombinant host cells (J?j 2 4 5777-A).

The vector may also comprise a selectable marker, e.g. a gene the
product of which complements a defect in the recombination host
cell, such as the gene coding for dihydrofolate reductase (DHFR) or
the Schizosaccharomyces pombe TPI gene (described by P.R. Russell,
(1985), Gene 40, 125-130).

Other examples of such suitable selective markers are the ura3 and
leu2 genes, which complement the corresponding defective genes of e.g.
the yeast strain Saccharomyces cerevisiae YNG318.

The vector may also comprise a selectable marker which confers
resistance to a drug, e.g. ampicillin, kanamycin, tetracyclin,
chloramphenicol, neomycin, hygromycin or methotrexate. For
filamentous fungi, selectable markers include amdS, pyrG, argB, niaD,
sC, trpC, pyr4, and DHFR.

To direct the polypeptide in question into the secretory pathway of
the recombination host cell, a secretory signal sequence (also known
as a leader sequence, prepro sequence or pre sequence) may be
provided in the recombinant vector. The secretory signal sequence is
joined to the DNA sequence encoding the lipolytic enzyme in the
correct reading frame. Secretory signal sequences are commonly
positioned 5' to the DNA sequence encoding the polypeptide. The
secretory signal sequence may be the signal normally associated with
the polypeptide in question or may be from a gene encoding another
secreted protein.

The signal peptide may be a naturally occurring signal peptide, or a
functional part thereof, or it may be a synthetic peptide. For
secretion from yeast cells, suitable signal peptides have been found
to be the α-factor signal peptide (cf. US 4,870,008), the signal
peptide of mouse salivary amylase (cf. O. Hagenbuchle et al.,
(1981), Nature 289, 643-645), a modified carboxypeptidase signal
peptide (cf. L.A. Valls et al., (1987), Cell 48, 887-897), the
Humicola lanuginosa lipase signal peptide, the yeast BAR1 signal
peptide (cf. WO 87/02670), or the yeast aspartic protease 3 (YAP3)
signal peptide (cf. M. Egel-Mitani et al., (1990), Yeast 6, 127-
137).

For efficient secretion in yeast, a sequence encoding a leader
peptide may also be inserted downstream of the signal sequence and
upstream of the DNA sequence encoding the polypeptide in question.
The function of the leader peptide is to allow the expressed
polypeptide to be directed from the endoplasmic reticulum to the
Golgi apparatus and further to a secretory vesicle for secretion
into the culture medium (i.e. exportation of the polypeptide across
the cell wall or at least through the cellular membrane into the
periplasmic space of the yeast cell). The leader peptide may be the
yeast α-factor leader (the use of which is described in e.g. US
4,546,082, EP 16 201, EP 123 294, EP 123 544 and EP 163 529).
Alternatively, the leader peptide may be a synthetic leader peptide,
which is to say a leader peptide not found in nature. Synthetic
leader peptides may, for instance, be constructed as described in WO
89/02463 or WO 92/11378.

For use in filamentous fungi, the signal peptide may conveniently be
derived from a gene encoding an Aspergillus sp. amylase or
glucoamylase, a gene encoding a Rhizomucor miehei lipase or
protease, or a Humicola lanuginosa lipase. The signal peptide is
preferably derived from a gene encoding A. oryzae TAKA amylase, A.
niger neutral α-amylase, A. niger acid-stable amylase, or A. niger
glucoamylase.

The recombination host cell 

The recombination host cell, into which the mixture of
plasmid/fragment DNA sequences is to be introduced, may be any
eukaryotic cell, including fungal cells and plant cells, capable of
recombining the homologous DNA sequences in question.

According to prior art, prokaryotic microorganisms, such as bacteria
including Bacillus and E. coli; eukaryotic organisms, such as
filamentous fungi, including Aspergillus, and yeasts such as
Saccharomyces cerevisiae; and tissue culture cells from avian or
mammalian origins have been suggested for in vivo recombination. All
of said organisms can be used as recombination host cells, but in
general prokaryotic cells are not sufficiently effective (i.e. do
not result in a sufficient number of variants) to be suitable for
recombination methods for industrial use.

Consequently, preferred recombination host cells according to the
present invention are fungal cells, such as yeast cells or
filamentous fungi.

Examples of suitable yeast cells include cells of Saccharomyces sp.,
in particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp. Methods for transforming yeast
cells with heterologous DNA and producing heterologous polypeptides
therefrom are described, e.g. in US 4,599,311, US 4,931,373, US
4,870,008, US 5,037,743, and US 4,845,075, all of which are hereby
incorporated by reference. Transformed cells may be selected by,
e.g., a phenotype determined by a selectable marker, commonly drug
resistance or the ability to grow in the absence of a particular
nutrient, e.g. leucine. A preferred vector for use in yeast is the
POT1 vector disclosed in US 4,931,373. The DNA sequence encoding the
polypeptide may be preceded by a signal sequence and optionally a
leader sequence, e.g. as described above. Further examples of
suitable yeast cells are strains of Kluyveromyces, such as K.
lactis, Hansenula, e.g. H. polymorpha, or Pichia, e.g. P. pastoris
(cf. Gleeson et al., (1986), J. Gen. Microbiol. 132, 3459-3465; US
4,882,279).

Examples of other fungal cells are cells of filamentous fungi, e.g.
Aspergillus sp., Neurospora sp., Fusarium sp. or Trichoderma sp., in
particular strains of A. oryzae, A. nidulans or A. niger. The use of
Aspergillus sp. for the expression of proteins is described in,
e.g., EP 272 277, EP 230 023. The transformation of F. oxysporum
may, for instance, be carried out as described by Malardier et al.,
(1989), Gene 78, 147-156.



In a preferred embodiment of the invention the recombination host
cell is a cell of the genus Saccharomyces, in particular S.
cerevisiae.



METHODS AND MATERIALS 

DNA sequence:

Lipase encoding DNA sequence: Humicola lanuginosa lipase.

Variants used for preparing vectors to be opened in Example 2:

(a) E56R, D57L, I90F, D96L, E99K
(b) E56R, D57L, V60M, D62N, S83T, D96P, D102E
(c) D57G, N94K, D96L, L97M
(d) E87K, G91A, D96R, I100V, E129K, K237M, I252L, P256T, G263A, L264Q
(e) E56R, D57G, S53F, D62C, T64R, E87G, G91A, F95L, D96P, K98I, (K237M)
(f) E210K

Variants used for preparing DNA fragments by standard PCR
amplification in Example 2:

(g) S33T, N94K, D95N
(h) E87K, D96V
(i) N94K, D96A
(j) E37K, G91A, D96A
(k) D167G, E210V
(l) S83T, G91A, Q249R
(m) E87K, G91A
(n) S83T, E87K, G91A, N94K, D96N, D111K
(o) N73D, E87K, G91A, N94I, D96G
(p) L67P, I76V, S83T, E87N, I90N, G91A, D95A, K98R
(q) S83T, E87K, G91A, N92H, N94K, D96M
(s) S85P, E87K, G91A, D96L, L97V
(t) E87K, I90N, G91A, N94S, D96N, I100T
(u) I34V, S54P, F80L, S85T, D96G, R108W, G109V, D111G, S116P, L124S,
V132M, V140Q, V141A, F142S, K145R, N162T, I166V, F181P, F183S,
R205G, A243T, D254G, F262L
(v) E56R, D57L, I90F, D96L, E99K
(x) E56R, D57L, V60M, D62N, S83T, D96P, D102E
(y) D57G, N94K, D96L, L97M
(z) E87K, G91A, D96R, I100V, E129K, K237M, I252L, P256T, G263A, L264Q
(aa) E56R, D57G, S53F, D62C, T64R, E87G, G91A, F95L, D96P, K98I
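The variant lists above use the usual substitution notation: original residue, position, replacing residue (for example D96L). As a minimal sketch of how such a list could be read into structured records, the following Python helper parses a comma-separated description. The names `Substitution` and `parse_variant` are hypothetical and chosen only for illustration; they are not part of the described method.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the residue in the parent lipase
    position: int   # residue number
    variant: str    # one-letter code of the replacing residue


# One capital letter, one or more digits, one capital letter, e.g. "D96L".
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_variant(text: str) -> List[Substitution]:
    """Split a comma-separated variant description into substitutions."""
    substitutions = []
    for token in text.split(","):
        token = token.strip().rstrip(".")
        match = _SUBSTITUTION.fullmatch(token)
        if match is None:
            # Unreadable entries are rejected rather than guessed.
            raise ValueError(f"unreadable substitution: {token!r}")
        wild_type, position, variant = match.groups()
        substitutions.append(Substitution(wild_type, int(position), variant))
    return substitutions


# Variant (h) from the list above.
variant_h = parse_variant("E87K,D96V")
```

Rejecting tokens that do not match the pattern, instead of repairing them, keeps doubtful entries visible for manual checking.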



Strains:

Expression system host:
Saccharomyces cerevisiae YNG318: MATa Dpep4[cir+] ura3-52, leu2-D2,
his 4-539

Saccharomyces cerevisiae Rad52: strain XI 53 3, MATa rad52 ura 2,
obtained from Torsten Nilsson-Tillgren, Institute of Genetics,
University of Copenhagen.

Plasmids:

pJS026 (see figure 3)
pJS037 (see figure 4)
pYES 2.0 (Invitrogen)

Transformation selective marker:

ura3

leu2

Media 

SC-ura-: 90 ml 10 x Basal salt, 22.5 ml 20% casamino acids, 9 ml 1%
tryptophan, H2O ad 806 ml, autoclaved, 3.6 ml 5% threonine and 90 ml
20% glucose or 20% galactose added.

LB-medium: 10 g Bacto-tryptone, 5 g Bacto yeast extract, 10 g NaCl
in 1 litre water.

Brilliant Green (BG) (Merck, art. No. 1.01310)

BG-reagent: 4 mg/ml Brilliant Green (BG) dissolved in water

Substrate 1:

10 ml olive oil (Sigma CAT NO. O-1500)

20 ml 2% polyvinyl alcohol (PVA)



WO 97/07205 



24 



PCT/DK96/00343 



The Substrate is homogenised for 15-20 minutes.

Methods:

Construction of yeast expression vector

The expression plasmids pJS026 and pJS037 are derived from pYES
2.0. The inducible GAL1 promoter of pYES 2.0 was replaced with the
constitutively expressed TPI (triose phosphate isomerase) promoter
from Saccharomyces cerevisiae (Alber and Kawasaki, (1982), J. Mol.
Appl. Genet., 1, 419-434), and the ura3 promoter has been deleted. A
restriction map of pJS026 and pJS037 is shown in figure 3 and figure
4, respectively.

Preparation of the wild-type DNA fragment

A lipase wild-type DNA fragment can be prepared either by PCR
amplification (resulting in low, medium or high mutagenesis) of the
pJS026 plasmid or by cutting the DNA fragment out by digesting with
a suitable restriction enzyme.



Fermentation of Humicola lanuginosa lipase variants in yeast

10 ml of SC-ura- medium is inoculated with a S. cerevisiae colony
and grown at 30°C for 2 days. The 10 ml is used for inoculating 300
ml SC-ura- medium which is grown at 30°C for 3 days. The 300 ml is
used for inoculation of 5 l of the following G-substrate:

400 g    Amicase
6.7 g    yeast extract (Difco)
12.5 g   L-Leucin (Fluka)
6.7 g    (NH4)2SO4
10 g     MgSO4·7H2O
17 g     K2SO4
10 ml    Trace compounds
5 ml     Vitamin solution
6.7 ml   H3PO4
25 ml    20% Pluronic (antifoam)

In a total volume of 5000 ml.

The yeast cells are fermented for 5 days at 30°C. They are given a
start dosage of 100 ml 70% glucose, and 400 ml 70% glucose is added
per day. A pH of 5.0 is kept by addition of a 10% NH3 solution.
Agitation is 300 rpm for the first 22 hours followed by 900 rpm for
the rest of the fermentation. Air is given with 1 l air/l/min for
the first 22 hours followed by 1.5 l air/l/min for the rest of the
fermentation.



Trace compounds:

6.8 g     ZnCl2
54.0 g    FeCl2·6H2O
19.1 g    MnCl2·4H2O
2.2 g     CuSO4·5H2O
2.58 g    CoCl2
0.62 g    H3BO3
0.024 g   (NH4)6Mo7O24·4H2O
0.2 g     KI
100 ml    HCl (concentrated)

In a total volume of 1 l.

Vitamin solution:

250 mg   Biotin
3 g      Thiamin
10 g     Calcium pantothenate
100 g    Myo-Inositol
50 g     Cholinchlorid
1.6 g    Pyridoxin
1.2 g    Niacinamid
0.4 g    Folic acid
0.4 g    Riboflavin

In a total volume of 1 l.



Transformation of yeast

Saccharomyces cerevisiae is transformed by standard methods (cf.
Sambrook et al., (1989), Molecular Cloning: A Laboratory Manual,
2nd Ed., Cold Spring Harbor).

Determination of yeast transformation frequency

The transformation frequency is determined by cultivating the
transformants on SC-ura- plates for 2 days and counting the number of
colonies appearing. The number of transformants per μg opened
plasmid is the transformation frequency.
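The definition above is a simple ratio. A minimal sketch in Python follows; the function name `transformation_frequency` and the example figures are hypothetical and serve only to illustrate the arithmetic. The mass may be given in any unit, and the result is expressed per that same unit.

```python
def transformation_frequency(transformant_count: int, plasmid_mass: float) -> float:
    """Return transformants per unit mass of opened plasmid."""
    if plasmid_mass <= 0:
        raise ValueError("plasmid mass must be positive")
    return transformant_count / plasmid_mass


# Hypothetical plate: 100 colonies obtained from 0.5 mass units of opened plasmid.
example_frequency = transformation_frequency(100, 0.5)
```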

Screening for positive variants with improved wash performance

The following filter assay can be used for screening positive
variants with improved wash performance.

Low calcium filter assay

1) Provide SC Ura- replica plates (useful for selecting strains
carrying the expression vector) with a first protein binding filter
(Nylon membrane) and a second low protein binding filter (Cellulose
acetate) on the top.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the double filter and incubate for 2 or 3 days at
30°C.

3) Keep the colonies on the top filter by transferring the top
filter to a new plate.

4) Remove the protein binding filter to an empty petri dish.

5) Pour an agarose solution comprising an olive oil emulsion (2%
PVA: olive oil = 3:1), Brilliant green (indicator, 0.004%), 100 mM tris
buffer pH 9 and EGTA (final concentration 5 mM) on the bottom filter
so as to identify colonies expressing lipase activity in the form of
blue-green spots.

6) Identify colonies found in step 5) having a reduced dependency
for calcium as compared to the parent lipase.

DNA sequencing

DNA sequencing was performed using an Applied Biosystems ABI DNA
sequence model 373A according to the protocol in the ABI Dye
Terminator Cycle Sequencing kit.

Assessing the efficiency of recombination

The number of colonies determines the efficiency of the opened
vector and fragment recombination. The percentage of colonies with
lipase activity gives an estimate of the mixing of the
active and inactive genes. Theoretically, assuming equal likelihood
of wild type and frameshift at each site, the best mixing for one
frameshift corresponds to a percentage close to 50%, for 2
frameshifts to 25% and for 3 frameshifts to 12.5%.
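The theoretical values quoted above follow from halving the expected active fraction for each independent frameshift. A minimal sketch of that calculation in Python is given below; the function name and the `wild_type_probability` parameter are hypothetical, and independence of the sites is an assumption of the sketch.

```python
def expected_active_fraction(frameshift_count: int,
                             wild_type_probability: float = 0.5) -> float:
    """Fraction of recombinants expected to carry no frameshift at all.

    Assumes each frameshift site independently ends up wild type with
    the given probability (0.5 for equal likelihood).
    """
    if frameshift_count < 0:
        raise ValueError("frameshift count cannot be negative")
    if not 0.0 <= wild_type_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return wild_type_probability ** frameshift_count


# Expected fractions for one, two and three frameshifts.
expected = {n: expected_active_fraction(n) for n in (1, 2, 3)}
```

With the default probability this gives 0.5, 0.25 and 0.125, i.e. the 50%, 25% and 12.5% stated in the text.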

Frameshift mutation

The frameshift mutations were created either by filling in a
restriction site (in case of 5' overhang) or deleting the "sticky
ends" (in case of 3' overhang) by T4 DNA polymerase with or without
dNTP (deoxynucleotides = equal amounts of dATP, dTTP, dCTP and
dGTP). Methods for filling in of restriction sites (referred to as
"F" on Figure 7) and deleting the sticky ends (referred to as
"(D)" on Figure 7) are well known in the art.
Method for assessing colonies with lipase activity

The number of colonies and positives (i.e. with lipase activity) are
calculated as the average of 3 plates.
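The averaging over plates can be sketched as follows in Python. The function name `summarize_plates` and the plate counts are hypothetical. The percentage is taken here as mean positives divided by mean colonies, which is one possible reading of the text and is labelled as an assumption.

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize_plates(plates: Sequence[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Average (colonies, positives) pairs over replicate plates.

    Returns mean colonies, mean positives and the percentage of
    positives, computed as mean positives over mean colonies
    (an assumption of this sketch).
    """
    if not plates:
        raise ValueError("at least one plate is required")
    mean_colonies = mean(colonies for colonies, _ in plates)
    mean_positives = mean(positives for _, positives in plates)
    if mean_colonies == 0:
        raise ValueError("no colonies were counted")
    percent_positive = 100.0 * mean_positives / mean_colonies
    return mean_colonies, mean_positives, percent_positive


# Three hypothetical plates: (number of colonies, number with lipase activity).
example = summarize_plates([(100, 20), (120, 30), (80, 10)])
```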

The cultivation conditions and screening conditions used are the
following:

1) Provide SC Ura- plates with a protein binding filter (Nylon
filter) on the plate.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the filter and incubate for 3 or 4 days at 30°C.

3) Remove the protein binding filter with the colonies to a petri
dish containing an agarose solution comprising an olive oil
emulsion (2% PVA: olive oil = 2:1), Brilliant green (indicator, 0.004%),
100 mM tris buffer pH 9.

5) Identify colonies expressing lipase activity in the form of
blue-green spots.

EXAMPLES

Example 1

Testing in vivo recombination of two homologous genes

The Saccharomyces cerevisiae expression plasmid pJS026 was
constructed as described above in the "Material and Methods"
section.

A synthetic Humicola lanuginosa lipase gene (in pJS037) containing
12 additional restriction sites (see figure 4) was cut with NruI,
PstI, and NruI and PstI, respectively, to open the gene
approximately in the middle of the DNA sequence encoding the lipase.

The opened plasmid (pJS037) was transformed into Saccharomyces
cerevisiae YNG318 together with an about 0.9 kb wild-type Humicola
lanuginosa lipase DNA fragment (see figure 1) prepared from pJS026
by PCR amplification.

Further, the opened plasmid was also transformed into the yeast 
recombination host cell alone (i.e. without the 0.9 kb synthetic 
lipase DNA fragment) . 

The transformed yeast cells were grown as described in the
"Materials and Methods" section above, and the transformation
frequency was determined as described above.

It was found that the transformation frequency of the opened plasmid
alone was very low (10 transformants per μg opened plasmid), in
comparison to the transformation frequency of said plasmid/fragment
(50,000 transformants per μg opened plasmid).

The plasmid/fragment transformants were PCR amplified, resulting in
20 transformants containing fragments covering the lipase gene region
of the recombined plasmid/fragments. The recombination mixture of the
20 transformants was analyzed by restriction site digestion using
standard methods. The result is displayed in Table 1.

Table 1

(blank: not tested)

P1   |    |    |    |    |    |    |    | wt | wt
P2   | sg | sg |    |    |    |    | wt |    | wt
P3   | sg | sg | sg |    | sg |    | sg | sg | nd
P4   |    | sg | sg |    |    |    |    | nd | nd
P5   |    |    |    |    |    |    | wt | wt | wt
P6   | sg | sg | sg |    | sg | sg | sg | sg | nd
N1   |    | wt |    |    |    | sg |    | wt | wt
N2   |    |    |    |    |    |    |    |    | wt
N3   |    |    |    |    | wt |    |    | wt | wt
N4   |    | sg |    |    |    | wt |    | wt | wt
N5   | sg | sg | sg |    | wt | wt | wt | wt | wt
N6   | wt |    |    |    | sg | sg | sg | sg | sg
P/N1 | sg | sg | sg |    | wt | wt | wt | wt | wt
P/N2 | sg | sg | sg |    | sg | sg | sg | sg | nd
P/N3 | sg | sg | sg |    | wt |    | sg | sg | sg
P/N4 | sg | sg | sg |    | sg | sg | sg | sg | nd
P/N5 | sg | sg | sg |    | sg | sg | sg | sg | nd
P/N6 | sg | sg | sg |    | wt | nd | sg | sg | sg
P/N7 | nd | wt | wt |    | wt | nd | wt | nd | wt
P/N8 | sg | sg | sg |    | wt | wt | wt | sg | nd

P: plasmid opened with PstI
N: plasmid opened with NruI
P/N: plasmid opened with PstI and NruI (resulting in a 75 bp fragment)
wt: wild-type gene restriction enzyme pattern
sg: synthetic gene restriction enzyme pattern
nd: not determined



As can be seen from Table 1, 10 transformants (equivalent to 50%)
contained recombined DNA sequences. 4 of these 10 DNA sequences
(equivalent to 20%) contained either a region of the wild-type gene
recombined into the synthetic gene or a region of the synthetic gene
recombined into the wild-type fragment.
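The percentages above are counts out of the 20 analysed transformants. As a minimal sketch in Python, a restriction pattern can be scored as recombined when it shows both wild-type ("wt") and synthetic ("sg") sites. This scoring rule and the names `classify_pattern` and `percent_of` are assumptions made for illustration, not a procedure stated in the text.

```python
from typing import Optional, Sequence


def classify_pattern(sites: Sequence[Optional[str]]) -> str:
    """Classify one transformant from its per-site restriction pattern.

    Entries are "wt", "sg", "nd" (not determined) or None (not tested).
    """
    scored = {site for site in sites if site in ("wt", "sg")}
    if scored == {"wt", "sg"}:
        return "recombined"
    if scored == {"wt"}:
        return "wild-type"
    if scored == {"sg"}:
        return "synthetic"
    return "undetermined"


def percent_of(count: int, total: int) -> float:
    """Express a count as a percentage of the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Figures quoted in the text: 10 and 4 out of 20 transformants.
recombined_percent = percent_of(10, 20)
mosaic_percent = percent_of(4, 20)
```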

Example 2

In vivo recombination of Humicola lanuginosa lipase variants

The DNA sequences of 20 variants of the Humicola lanuginosa lipase
were in vivo recombined in the same mixture.

Six vectors were prepared from the lipase variants (a) to (f) (see
the list above) by ligation into the yeast expression vector pJS037.
All vectors were cut open with NruI.

DNA fragments of all 20 homologous DNA sequences (g) to (aa) (see the
list above) were prepared by PCR amplification using standard
methods.

The 20 DNA fragments and the 6 opened vectors were mixed and
transformed into the yeast Saccharomyces cerevisiae YNG318 by
standard methods. The recombination host cell was cultivated as
described above and screened as described above. About 20
transformants were isolated and tested for improved wash performance
using the filter assay method described in the "Material and
Methods" section.

Two positive transformants (named A and B) were identified using the
filter assay.

30 

In comparison to the wild-type amino acid sequence the two
recombined positive transformants had the following mutations.

A: D57G, N94K, D96L, P256T

A is a recombination of two variants.
originates from the vector (d)

==== originates from the DNA fragment prepared from variant (y)

B: D57G, G59V, N94K, D96L, L97M, S116P, S170P, N249R
???? <<<<< ????? =====

B is a recombination of vector (c), DNA fragments (n) and (u).
originates from the vector (c)

<<<<< originates from the DNA fragment prepared from variant (u)
===== originates from the DNA fragment prepared from variant (n)
???? Amino acid mutation which is not a result of recombination.

As can be seen, the resulting positive variants have been formed by
recombination of two or more variants. The amino acid mutations marked
"?????" are not a result of in vivo recombination, as none of the
shuffled lipase variants (see the list above) comprise any of said
mutations. Consequently, these mutations are a result of random
mutagenesis arising during preparation of the DNA fragments by
standard PCR amplification.
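The reasoning above amounts to a set comparison: a mutation found in a recombinant but in none of the shuffled parents cannot have arisen by recombination. A minimal sketch in Python follows. The function names are hypothetical, and the example uses only variant (y) from the list above together with the first five substitutions reported for transformant B, so it is an illustrative subset rather than the full analysis.

```python
from typing import Dict, List, Set


def trace_mutations(recombinant: Set[str],
                    parents: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each mutation of the recombinant to the parents that carry it."""
    return {
        mutation: sorted(name for name, mutations in parents.items()
                         if mutation in mutations)
        for mutation in sorted(recombinant)
    }


def unexplained_mutations(recombinant: Set[str],
                          parents: Dict[str, Set[str]]) -> Set[str]:
    """Return mutations present in no parent (candidate PCR errors)."""
    pooled = set().union(*parents.values())
    return recombinant - pooled


# Illustrative subset: variant (y) as the only parent considered here.
parents = {"y": {"D57G", "N94K", "D96L", "L97M"}}
recombinant_subset = {"D57G", "G59V", "N94K", "D96L", "L97M"}

origins = trace_mutations(recombinant_subset, parents)
novel = unexplained_mutations(recombinant_subset, parents)
```

In this subset only G59V is left without a parent, in line with its marking in the text as a mutation that is not a result of recombination.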

Example 3

Recombination with one frameshift mutation

The synthetic Humicola lanuginosa lipase gene (in vector pJS037) was
made inactive at various positions by deleting (positions 134/335)
or filling in (positions 2 90/5 1 7 / 5 1 5 /" 4 6) restriction enzyme sites,
or by site-directed introduction of a stop codon (all inactive
synthetic lipase genes of 510 bp can be deduced from Figure 7).

A number of different 910 bp DNA fragments were made from the above
vectors using primer 4 6 9 r and primer 5164 using standard PCR
technique. Smaller PCR fragments were made using primer 8 4 57 and
primer 4543 {25000}, primer 2343 and primer 4543 (438 bp).

0.5 μl (app. 0.1 μg) of vectors Blue 425, Blue 426, Blue 428 and
Blue 429, opened with PstI (i.e. position 335), and vectors Blue 424
and Blue 425 opened with NruI (i.e. position 464) were, together
with 3 μl (app. 0.5 μg) of fragments 424, 425, 426, 428, 429 in
various combinations, transformed into 100 μl Saccharomyces cerevisiae
YNG318 competent cells as displayed in Table 1A.

The number of colonies and positives (i.e. with lipase activity)
were calculated as the average of 3 plates as described in the
Material and Methods section.

The result of the test is shown in Table 1A.
Table 1A

Vector + Fragment          | Number of colonies | % of colonies with active lipase activity
1. Blue 428 + 429*         | 774                | 16%
2. Blue 429 + 428$         | 645                | 3%
3. Blue 426 + 425#         | O *7 C             |
4. Blue 425 + 426          | 528                | 18%
5. Blue 425/NruI + 426     | 539                | 28%
6. Blue 425 + 424          | 139                | 7%
7. Blue 424/NruI + 425^    |                    | 32%
8. Blue 425 + 425          | El                 | 12%
9. Blue 425 + wt fragment  | 317                | 37%

Pairwise recombinations of one frameshift mutation on the vector
and another on the fragment on the opposite side of the opening
site. = determined by 9 plates; - determined by 5 plates.

The first 2 rows of Table 1A display vectors and fragments with a
frameshift on each side of the PstI site. The "mirror image"
experiment in row 2 compared to row 1 gives a reproducibly lower
number of active colonies. The same is true for rows 3 and 4 even
though it is not as pronounced. Moving the opening site closer to
the frameshift in the vector increases the number of actives as
seen in row 5. This can explain the reason for the difference in
the "mirror image" experiments. In both cases the higher number of
positives has the opening site closer to the frameshift in the
vector.

It can therefore be concluded that the closer the mutation is to
the end of the vector, the higher the chance of mixing. This probably
arises from the well known fact that free DNA ends have a high
recombinogenic potential. Therefore it is desirable to have as many
free DNA ends as possible to increase the mixing of the genes. This
is for example obtained in the later example with recombination of
multiple overlapping fragments.

Row 6 has a rather low number of actives, probably due to the
location of the frameshift on the fragment exactly at the PstI
opening site of the vector.

Row 7 has the frameshift of the vector close to the opening site
and again it gives a high number of actives.

Recombination with one stop codon mutation

In order to test if there is any difference in the recombination
efficiency of stop codon mutations compared to frameshift mutations
the following experiments were made.

In the same way as described above, 0.5 μl (app. 0.1 μg) of vectors Blue
624, Blue 625 and Blue 626 (see Table 1B) opened with PstI,
comprising stop codons at specified positions (positions 1S4, 317
and 745, respectively) (prepared by site-directed mutagenesis), were,
together with 3 μl (app. 0.5 μg) of fragments 624, 625 and 626,
transformed into 100 μl Saccharomyces cerevisiae YNG318 competent
cells in various combinations as displayed in Table 1B.



Table 1B

Vector + Fragment   | Number of colonies | % of colonies with lipase activity
1. Blue 625 + 624   |                    | 40%
2. Blue 624 + 625   | ND                 | 12%
3. Blue 625 + 624   | ND                 | 75%
4. Blue 624 + 625   | ND                 | 10%

Pairwise recombinations of one stop codon mutation on the vector
and another on the fragment on the opposite side of the opening
site. ND = not determined but a high number.

Rows 1 and 2 (in Table 1B) have the mutations located at the same
place as rows 1 and 2 in Table 1A. As can be seen, the number of
colonies with lipase activity is clearly higher for the stop codon
mutations compared to the frameshift mutations, but with the same
relative difference between the "mirror image" experiments.

This might indicate that the stop codon mutations, which are closer
to the "application" of the method, give a better mixing than
frameshift mutations. Rows 3 and 4 confirm that the closer the
mutation is to the end of the vector, the higher the chance of mixing.

Recombination with one or two frameshift mutations in the vector
and one or two frameshift mutations in the fragment

Using the same approach as described above, the influence of one or
two frameshift mutations in the vector and one or two frameshift
mutations in the fragment was tested using vectors Blue 425, 426
and 428 (one mutation) and vectors Blue 442 and Blue 443 (two
mutations), and fragments 442 and 443 (two frameshift mutations),
fragments 424, 425, 426, 427, 428 (one mutation) and wild-type
(no mutation).

The vectors Blue 442 and 443 are double frameshift mutations: Blue
442 = 428 + 429 and Blue 443 = 427 + 429 (see Figure 7).




The result of the test is shown in Table 2A and Table 2B.

Table 2A

Vector + Fragment     | Number of colonies | % of colonies with active Lipolase
1. Blue 425 + 442     | 142                | 15%
2. Blue 425 + 443     | 111                | 14%
3. Blue 426 + 442     | 42                 | 42%
4. Blue 426 + 443*    | 77                 | 20%
5. Blue 428 + 443     | 115                | 3.8%

One frameshift mutation on the vector and two on the fragment on
each side of the opening site. * = determined by 6 plates.



Table 2B

Vector + Fragment          | Number of colonies | % of colonies with active Lipolase
Blue 442 + 424             | 137                | 0.5%
Blue 442 + 426             | 118                | 1.1%
Blue 442 + 427#            | 125                | 1.3%
Blue 443 + 425             | 540                | 2.5%
Blue 443 + 426             | 196                | 1.5%
Blue 443 + 428             | 469                | 3.1%
Blue 442 + wt fragment     | 135                | 7.7%
Blue 443 + wt fragment     | 488                | 10.7%

Two frameshift mutations on the vector on each side of the opening
site and one on the fragment. # = determined by 6 plates.



Table 2A shows a rather high number of colonies with lipase
activity even with a total of 3 frameshifts (but only one
frameshift on the vector), except for the last row where the
frameshift on the vector is located far from the opening site. Lane
4 has fewer actives than lane 3, probably because the frameshift
on the vector is located further away from the opening site than
the frameshift on the fragment, making the active genes mosaics that
are not related to the opening site (see figure 2A). In Table 2B a
very low number of actives is observed when there are 2
frameshifts located on the vector. Most of these active colonies
are mosaics of the "parent" DNA, meaning that the mixing is not
related to the opening site (see figure 2B).



Recombination with two different vectors or fragments

The result of recombination with two different vectors or
fragments is shown in Table 3.

Table 3

Vector + Fragment                       Number of colonies   % of colonies with active Lipolase

Blue 428/PstI + Blue 429/PstI #         13                   15%
Blue 428/PstI + Blue 429/PstI + 442     273                  4.2%
Blue 442/PstI + 428 + 429               223                  0.8%
Blue 443/PstI + 427 + 428               228                  1.6%

Recombination with two different vectors or fragments. # Determined
by 1 plate.

A low number of colonies is seen for the control experiment in row
1 of Table 3, as expected. The fragment added in the middle row has
two frameshifts, each corresponding to the frameshift on one of the
vectors. Via a tripartite recombination 4.2% actives are created.
With two fragments each carrying one frameshift and a vector with the
same two frameshifts, very few actives are found.

Recombination with vectors opened at different sites 



Opening the vector at one side instead of approximately in the
middle still gives good recombination, as shown in Table 4. Two
vectors opened at different sites can also recombine to some extent
(compare with the vector controls in table 13).



Vector ~r : r a erne n t 




* i ■ urr jc e r o z 


i % of colonies with active 
Li po la se 


Blue 4 2 5 /x ho ~ 4 29 




! :i% 


Blue 4 2 3 /xhe-Blue 
429/ps t = 






! S.3% 

i 


Od e n i r. c of t r. e vector i r 
determined by 5 plates. 






Recombination at different concentrations of vector and fragment

The relative concentration of vector to fragment does influence the
percentage of positive colonies, as can be seen in Table 5.






Table 5

Vector + Fragment                Number of colonies   % of colonies with lipase activity

0.5 µl Blue 426 + 3 µl 442       42                   42%
1.5 µl Blue 426 + 3 µl 442                            51%
1.5 µl Blue 426 + 9 µl 442       34                   26%
1.5 µl Blue 426 + 3 µl 427       230                  2.8%
1 µl Blue 442 + 1 µl 425         224                  1.16%
1 µl Blue 442 + 2 µl 425         429                  0.9%
Blue 442 + 4 µl 425              434                  1.6%
1 µl Blue 442 + 8 µl 425         481                  1.6%
1 µl Blue 442 + 16 µl 425        497                  2.0%

of the vector or fragment.
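
As a rough reading aid for Table 5, the rows in which Blue 442 is combined with increasing volumes of fragment 425 can be tabulated as in the sketch below. The variable and function names are illustrative only, the volumes are read as µl from the table, the row whose vector volume is not legible is left out, and multiplying the colony count by the percentage gives only an approximate number of active colonies.

```python
# Rows of Table 5 for Blue 442 + fragment 425, as read from the table:
# (vector volume in µl, fragment volume in µl, colonies, % active).
# These readings are an assumption wherever the printed table is unclear.
titration = [
    (1, 1, 224, 1.16),
    (1, 2, 429, 0.9),
    (1, 8, 481, 1.6),
    (1, 16, 497, 2.0),
]


def summarize(rows):
    """Return (fragment:vector volume ratio, % active, approx. active colonies)."""
    summary = []
    for vector_ul, fragment_ul, colonies, pct_active in rows:
        ratio = fragment_ul / vector_ul
        approx_active = colonies * pct_active / 100
        summary.append((ratio, pct_active, round(approx_active, 1)))
    return summary


for ratio, pct, approx in summarize(titration):
    print(f"fragment:vector = {ratio:g}:1  active = {pct}%  (~{approx} colonies)")
```

Note that these are volume ratios, not molar ratios; converting to mol vector:mol fragment would require the DNA concentrations and lengths, which the table does not state.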



Recombination with fragments of different size

The size of the fragment also influences the recombination result,
as seen in Table 6.



Vector - Fragment 


Number of ; % of colonies with active 
colonies : Lipoiase 


Blue 424 ~ 425 
(2 50bp) 


75 


3iue 424 + 425 
(46 9b?) 


130 ! 45% 

i 


Blue 4 2 4. - 424 
( 4B0bp) 


133 1 °- 3% 

1 » 


Blue 424 * 42= 
(4 50b?} 


»o ! 


36% 


Blue 42E - 425 
(420b?) 


150 


23% | 


Blue 425 - 424 
(4S0b?) 


69 


0% 


Blue 425 + 423 
(480bp) 


63 


55% 

_ 



(480bp) 

,1-bmaticn with scalier fragments tna 



Recombination with unopened vectors

Transformation with unopened vectors shows a very low degree of
recombination (Table 7).



Table 7

Plasmid                Number of colonies   % of colonies with active Lipolase

Blue 428 + Blue 429    887                  0.3%
Blue 426 + Blue 425    697                  0.7%

Recombination of unopened plasmids.



Example 4 

Test of S. cerevisiae mutants altered in recombination

Using the same approach as described in Example 3, recombination of
opened and unopened vectors and fragments was tested using a
Saccharomyces cerevisiae rad52 mutant as the recombination host
cell. The result is displayed in Table 8.

Table 8



Vector + 
Fragment 


Number of 
colonies 


\ of colonies with active 
Lipolase 


3iue 423 - 429 




0 




C 


3iue 442 - 427 




0 




0 


Blue 424 r 425 ' 




0 






3iue 4 25 44 5 ! 0 


0 


Piasmid oJSO ! 
37 * j 




54 4 




100* 


Recombination result in rad52 mutant.

The result with rad52 showed that recombination was completely
abolished. The rad52 function is required for classical
recombination (but not for unequal sister-strand mitotic
recombination), showing that the recombination of opened vector and
fragment could involve a classical recombination mechanism.


Example 6

Recombination of multiple partial overlapping fragments

In order to increase the mixing of the mutations by the
recombination method of the invention, recombination of two
fragments and one gapped vector was attempted.



Table 15

Vector + Fragment                             Number of colonies   % of colonies with lipase activity

1. pJS037/HindIII-XhoI + PCR319+PCR327        > 2000               100%
2. pJS037/HindIII-XhoI + PCR321+PCR331        ~ 2000               ~ 0.2%
3. pJS037/HindIII-XhoI + PCR319+PCR331        ~ 1500               ~ 1%
4. pJS037/HindIII-XhoI + PCR319+PCR386        > 5000               > 90%
5. pJS037/HindIII-XhoI + PCR321+PCR386        > 5000               ~ 25%
6. Blue 428/HindIII-XhoI + PCR321+PCR331      400                  0.2%
7. Blue 428/HindIII-XhoI + PCR319+PCR327      ~ 1500               > 90%
8. Blue 428/HindIII-XhoI + PCR321+PCR327      ~ 150                ~ 10%


9. Blue 423/HindIII- 
Xhol + PCR327+PCR335 


~ 1500 


= 10% 


10. 31ue 429/HindIII- 
Xhol + ?CR313*?CR3B6 \ 


i 


* 15% 


11. Blue 429/HindIII- 
Xhol r ?CR321~?CR336 1 




~ 1 5 % 


12. 3lue 442/HincIII- 
Xhol -r PCR319-PCR327 




* 10^ 


X h o I - 


2 1 0 
1 


Xhol - 




0 


15. Blue 442/Hir.d:::- 

XhOl r 




0 


15. Blue 428/Hir.dIII- 
Xhol -r PCR331 




0 


17. 3lue 426/HindIII- 
Xhol + PCR321 


2 

- , . ^ ~ - a rr— 3 r e p n r 


0 

» canned vector. The; last 



Recombination result o: two 



As can be seen in Table 15, the recovery of the Humicola lanuginosa
lipase gene is very efficient. The last 5 rows in Table 15 show
that the opened vector alone, or with only one fragment not covering
the whole gap (see figure 3), gives only very few colonies.

The first row, with wild type fragments, gives 100% active
colonies.



The second row is with two fragments, each containing a frameshift.
Fragment PCR331 has the frameshift located at the
BglII site which, in this recombination, is not covered by a wild



WO 97/07205 



39 



PCT/DK96/00343 



type fragment (see figure 3) and therefore gives about 0% of active 
lipase. The same is the case for row 3 and 6. 

In row 4, fragment PCR386 contains a frameshift at the SphI
site, which is overlapped by wild type sequences in the gapped
vector. The frameshift was recombined into less than 10% of the
genes, which is lower than the result for one-fragment recombination
in the last row of Table 1A above.

In row 5 a rather high mixing is observed between the 2 fragments,
each containing a frameshift, and the wild type gapped vector, giving
25% active and 75% inactive lipase colonies. This is probably because
fragment PCR321 has the frameshift in the overlap
between the 2 fragments and in the gapped region of the vector. If
fragment PCR386 contributes 10% inactives as in row 4,
fragment PCR321 gives the remaining 65% inactives; therefore
PCR386 gives 35% wt in the overlap.
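
The percentages in the row 5 discussion follow from simple subtraction, which the short sketch below spells out. The variable names are illustrative, and the 10% contribution of PCR386 is the assumption carried over from row 4, not a separately measured value.

```python
# Figures from the row 5 discussion, all as % of colonies.
inactive_total = 75         # inactive colonies observed in row 5
inactive_from_pcr386 = 10   # assumed contribution of PCR386, as in row 4

# Inactives that must then be attributed to the frameshift on PCR321.
inactive_from_pcr321 = inactive_total - inactive_from_pcr386

# Share of genes that did not take the PCR321 frameshift in the overlap,
# i.e. that carry wild type sequence from PCR386 there.
wt_in_overlap = 100 - inactive_from_pcr321

print(inactive_from_pcr321, wt_in_overlap)
```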

Row 7 is the "mirror image" of row 4, with the frameshift at the
SphI site on the vector (see Figure 3) and 2 wild type fragments,
giving an integration of the wild type fragment into more than 90%
of the vectors.

Row 8 shows, like row 5, that the frameshift of PCR321 in the
overlap and gap region gives a very high number of inactives.

In row 9, fragment PCR3E5 with a frameshift in the vector overlap, 
causes a very high number of inactives. 

Row 10 gives a rather high number of inactives compared to rows 7
and 4. It is not increased in row 11.

Row 12 shows that two frameshifts on the vector give a lower
number of actives compared to one in row 7.

The recombination of 3 partially overlapping fragments into a gapped
vector is also very efficient, as seen in Table 16. The last row,
with the vector alone, gives very few colonies. As can be seen in
figure 4, all fragments used are wt. In the first row in Table 16
there are rather long overlaps between the vector and fragments,
but in the middle row the overlap between PCR353 and 355 is only 10
bp long and it is still very efficiently recombined. This
surprising result may be utilized for very easy domain shuffling of
even distantly related genes. For example, 3 different domains
from each of 10 different genes can be made as PCR fragments,
designed to have a 10 to 20 bp overlap by primer design, recombined
together and subsequently screened for the best combination (1000
possible combinations).
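
The figure of 1000 possible combinations is 10 choices of parent gene at each of 3 domain positions. A minimal sketch of that count follows; the gene labels are hypothetical placeholders, and the code only enumerates the combinations, it does not model the recombination itself.

```python
from itertools import product

# Hypothetical labels for the 10 parent genes; each contributes 3 domains.
genes = [f"gene{g}" for g in range(1, 11)]
n_domains = 3

# One chimera = one choice of parent gene for each domain position.
chimeras = list(product(genes, repeat=n_domains))

print(len(chimeras))   # number of possible domain combinations
print(chimeras[0])     # e.g. all three domains taken from the first gene
```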



Table 16




Vector + fragment 


Number of 
colonies 


% of colonies with active 
Lipolase 


pJS037/?vuI I-Spel + 
FC?.3 53-?CR35 4-?CR3c" 


> 5000 


100% 


oJS03 7/?vuI I-Spel + 
?C?.3 5 3*?CR3 5 5-?CR3S7 


> 5000 


100% 


pJ5037/?vui:-£peI 


20 i 100% 






gapped vector. The last row is a control.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Novo Nordisk A/S
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: Denmark
(F) POSTAL CODE (ZIP): DK-2880
(G) TELEPHONE: +45 4444 8888
(H) TELEFAX: +45 4449 3256

(ii) TITLE OF INVENTION: Method for preparing polypeptide variants
(iii) NUMBER OF SEQUENCES: 15

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B (EPO)

12: INFORMATION FOR SED ID NC : 1: 

25 :d sequence characteristics: 

{ A ; LENGTH: 2: base pairs 
,'=■: TYPE: nucleic acic 
;:; STPANDECNESS : sm;le 
(D! TOPOLOGY : linear 
30 (;;) "DLECULE TYPE: otner nuclei: eric 

(A* DESCRIPTION: .'test - ' Pnr.er 2 B -i 3 " 
(xi) SEQUENCE DESCRIPTION: SEQ II NO: i: 



35 



A C A_A--. C A . „ .-. '■w :Gv — 'o - o 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4539"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CGGTACCCGG GGATCCAC 18

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5164"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATTACATCA TGCGGCCC 18

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 8487"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CATTTGCTCC GGCTGCAGGG A 21
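
A quick consistency check on the primer entries above is to compare each sequence with its declared length. The sketch below does this for SEQ ID NO: 2 to 4; the dictionary layout and function name are illustrative, and the length of 18 for SEQ ID NO: 2 is an assumed reading of a poorly legible entry.

```python
# Primer sequences and declared lengths as read from SEQ ID NO: 2-4.
# The declared length for SEQ ID NO: 2 is an assumed reading.
primers = {
    "SEQ ID NO: 2": ("CGGTACCCGG GGATCCAC", 18),
    "SEQ ID NO: 3": ("AATTACATCA TGCGGCCC", 18),
    "SEQ ID NO: 4": ("CATTTGCTCC GGCTGCAGGG A", 21),
}


def check_primer(sequence, declared_length):
    """True if the sequence has the declared length and only A, C, G, T."""
    bases = sequence.replace(" ", "")
    return len(bases) == declared_length and set(bases) <= set("ACGT")


for name, (sequence, length) in primers.items():
    status = "ok" if check_primer(sequence, length) else "mismatch"
    print(f"{name}: {status}")
```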
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 
(CJ STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "?ri~er 4543" 
(xi) SEQUENCE DESCRIPTION: SZQ ID NO: 5: 

GTGTTCCGCC GGTCTGTACG GTCAGG.---. . - C .-.* w -* - 
■2; INFORMATION FOR. SEQ ID NO: 6: 



(A) LEN 



1 j . r. 



(5) TYPE : nucie:: acic 

[Z) STRANDEDNESS: single 

(D) TOPOLOGY : linear 
Ui) MOLECULE TYPE : other nuclei: acic 

(A) DESCRIPTION: 'desc "Fn...e: ^75 H 
(xi) SEQUENCE DESCRIPTION: SEO IE Nj: 

GGTCTGTr.CG GiLr,'jo.---.i ^ 

:2) INFORMATION FOP. SZQ ID NI : 

(i) SEQUENCE CHAP^OTEPISO: 

(A) LENGTH: 19 base pairs 
(3) TYPE : nuclei: acid 

(C) STRANDEDNESS: si: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acic 

(A) DESCRIPTION: /desc = "?ri~er 5578" 
(Xi) SEQUENCE DESCRIPTION : SEQ ID NO: ~: 

CGTTTCGGGT GACGGGGAv*, ^ 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 1596"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGAGCAAATG TCATTTAT 18

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4545"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GCATTGGCAA CTGTTGCCGG AGCAGACCTG CGTGGAAATG
GGTATGATAT CGACGTGTTT TCAT 64

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : circular 

i i ) MOLECULE TYPE: other nucleic b z ~ z 

,'A) DESCRIPTION: /zesz = " Ve ; : : : pJS02 5" 

(viJ ORIGINAL SOCRCE: 

(ix) FEATURE: 

(Ai NAME /KEY : CDS 

(xi) SEQUENCE INSCRIPTION : S7C ID NC : 10: 

ATG AGG ACC TCC CTT GTC C7G 7 7 1 77T GTw - 7 . c-o *oo r\ _ C oC C 1 To 4 B 

s '°t r-**c S° - 5 •=> - ' om v- • L°u Pr.e Phe V a I Ser Ala Tro Thr Ala Leu 
'"7 " ' " " = "* io * 15 

GCC ACC CC7 AT 7 CC7 CCA GAC G7C 7CG CAG CAT C7C TTT AAC CAG TTC 95 

p ■ a ^ - * t a 1 - - G'u Val Ser Gin Aso Leu Phe As n Gin Phe 

' 20 ' 23 30 

AAT C7C 777 GCA CAC 7A7 7C7 GCA GCC GCA CAC i GC GGA AAA AA C AAT 14 4 

Asn Leu Phe Ala Glr. 7vr Ser Ala Ala Ala Tyr Cys Gly Lys A sr. Asn 

3 5 * 4 0 - 4 5 

GAT GCC CCA GC7 GG7 ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC 192 
Aso Ala Pro Ala Giv Thr Asn lie Thr Cys Thr Gly Asn Ala Cvs Pro 
50 ' 55 60 

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT 240 
Glu Val Giu Lvs Ala Aso Ala Thr Phe Leu Tyr Ser Phe GIu Asp Ser 
65 * 70 75 80 

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTC GAC AA.C ACG AAC AAA 28 3 

Gly Val Gly Aso Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 
* 85 50 95 

TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAG AAC TGG ATC 336 
Leu He Val Leu Ser Phe Arg Gly Ser Arg Ser He Glu Asn Trp He 
100 105 110 

GGG AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC 384 
Gly Asn Leu Asn Phe Aso Leu Lys Giu He Asn Asp He Cys Ser Gly 
115 * 120 125 



TGC AGG GGA CAT GAC GGC TTC ACT TCG TCC TGG AGG TCT GTA GCC GAT 
Cys Arg Giy His Aso Gly Phe Thr Ser Ser Trp Arg Ser Val Ala As? 
130 * 135 I 40 



5 ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT 
Thr Leu Aro Gin Lvs Val Glu As? Ala Val Arg Glu His Pro Asp Tyr 
i 4 5 ' ' 150 155 160 

CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT 
10 Arg Val Vai Phe Thr Gly Kis Ser Leu Gly Gly Ala Leu Ala Thr Val 

165 I 70 175 

GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAT ATC GAC. GTG TTT TCA 
Ala Gly Ala Aso Leu Arg Gly Asn Gly Tyr Asp lie Asp -VaJL -Phe Ser 
15 180 185 190 

TAT GGC GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC 
Ty~ Gly Ala Pro Arc Val Glv Asn Arg Ala. Phe Ala Glu Phe Leu Thr 
155 ' 200 . 205 

GTA CAG ACC CCC GGA AC A CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT 
Val Glr. Thr Glv Civ T.-.r Leu Tyr Arg He His Thr As.-. As? lie 
210 " ' 21= 220 



Cys Leu * 
2 9G 



•2) information re?, si; :c no: 



(1) SEQUENCE CKAPA.CTERISTICS : 
{A) LENGTH: 2 92 amino aeics 
50 { B) TYPE : a-ir.c acici 

;D) TOPOLOGY: linear 



{ii) MOLECULE TYPE : protein 
(xi) £IQu£NCH: DESCRIPTION: SEQ TO NO: ii: 

Mot P~Q Se- Ser Leu Val Leu Phe Phe Vai Ser Ala Trp Thr Ala Leu 
i 5 10 15 

Ala Ser Pro lie Arc Arc Glu Vai Ser Gin As? Leu Phe Asn Gla Phe 
60 * 20 25 30 

Asn Leu Phe Ala Gin Tvr Ser Ala Ala Ala Tyr Cys Giy Lys Asn Asn 
35 ' 40 45 



Aso Ala Pro Ala Glv Thr Asn lie Thr Cys Thr Gly Asn Ala Cys Pro 
50 * 55 60 



432 



480 



528 



576 



624 



672 



2 3 gtc ::t a:- a ::: ccg g:g ggg gaa : . : ggt tag ago cat tct ago cca 120 

Va'> Pro Arr Leu Pro Pro Aro Glu Pne Giy Tyr Ser His Ser Ser Pro 

225 ' 23: * ^5 240 

GAG TAG TGG AT G AAA TTT GGA AGG GTT GTG ZZZ GTG AGO GGA AAC GAT 755 

30 Glu Tvr Trc He Lvs Ser Giv Tnr Leu Vai Pre Val -r Arc ,.s:. Asp 

24 5 2 5C 2=3 

ATG GTG AA.G ATA GAA. GGG ATT GAT GGG AGO GGG GGC AAT- AAC CAG COT 316 

lie Vai Lvs :ie Glu Glv Tie Aso Ala Thr Giy Giy Asn Asn Gip. Pro 
35 * HZ ' 265 270 

AAC ATT CCG GAT ATG GGT GGG CAG GTA TGG TAG TTC GGG TTA ATT GGG S54 

Asn He Pro Aso He Pre Ala His Leu Trp Tyr Phe Gly Leu lie Giy 

27 5 " 2-:: 2 = 5 



37 6 
Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
65 70 75 80 

Glv Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 
5 85 50 95 

Leu lie Val Leu Ser Phe Arg Gly Ser Arg Ser lie Glu Asn Trp lie 
100 105 110 

10 Glv Asn Leu Asn Phe Asp Leu Lys Glu lie Asn Asp lie Cys Ser Gly 
115 120 12o 

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser_V.al__ Ala Asp 
130 135 140 

15 Th- Leu Arg Gin Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr 
145 150 I 55 I 60 

A-c Val Val Phe Thr Gly His Ser Leu Giy Gly Ala Leu Ala Thr Val 
20 ' 165 i7 ° 175 

Ala Glv Ala Aso Leu Arg Gly Asn Gly Tyr Asp lie Asp Val Phe Ser 

ISO - rD 

25 Tvr Glv Ala Pro Arc Val Gly Asn Arg Ala Phe Ala Glu Pr.e Leu Thr 






Val Gin Thr Gly Gly Thr Leu Tyr Arg He Tr.r His 
210 2i:- 



225 



240 



GH Tvr Tro He Lvs Ser Glv Thr Leu Val ?rs Val Thr Arg Asn Asp 

35 " ' 245 ' 251 255 

I- Val Lvs lie Glu Glv He As? Ala Thr Hy Gly Asn Asn Gin Pro 

' ?;n * 255 270 



40 Asn He Pro Aso He Pro Ala His Leu Tr? .yr - ne ^iy 
275 2 50 "3 



Leu He Glv 



Ihr Cvs Le ; 
290 



[2] INFORMATION FOR SEQ 10 NO: 12: 



(i) SEQUENCE CHARACTERISTICS : 
50 (A) LENGTH: 8 "? 6 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: circular 

55 Hi) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /6esz = "Sector pJSG37 



(vi) ORIGINAL SOURCE : 

(B) STRAIN: Kumicola lanuginosa 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
.(B) LOCATION : 1 . .87 6 



Cxi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG
Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1 5 10 15

GCC AGT CCT ATA CGT AGA GAG GTC TCG CAG GAT CTG TTT AAC CAG TTC 
Ala S»r Pro lie Arg Arg Giu Val Ser Gin Asp Leu Phe Asn Gin Phe 
20 25 30 

AAT C^C TTT GCA CAG TAT TCA GCT GCC GCA TAC TGC GGA AAA AAC AAT 
Leu Phe Ala Gin Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn 

35 40 

GAT GCC CCA GCA GGT ACA AAC ATT ACG TGC ACG GGA AAT GCA TGC CCC 
Aso Ala Pro Ala Glv Thr Asn lie Thr Cys Thr Gly Asn -Al-a__Cys Pro 
* 50 55 60 



GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG T:T GAA G«C TCT 

Glu Val Glu Lvs Ala As? Ala Thr Phe Leu Tyr Ser Phe Giu Asp Ser 

65 * 70 . '5 80 

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTC GAC AAC ACG AAC AAG 

Glv Val Giv Aso Val Thr Gly Phe Leu Aia _eu ms? *sr. :r.r Asr. lvs 

c - 90 * 3 



48 



96 



144 



192 



240 



286 



„eu ::e V: 



leu Ser Phe Arg 



. VJ'O . i U 

lie Glu Asr; Trp lie 



;35 



ser uiv 



120 



L 25 



lvs Arg Gly His As: 
130 



~~r ATT TCG TTC TGG AGG TCT GTA" GCC GAT 
Phe Thr Ser Ser Trp Arg Ser Val Ala Asp 
i 14 0 



CAT CCC 



ACG TTA AGG CAG AAG o.-j - -~ - --.*-■- u- - * ° * * h"-" T " ^ < r r "*™ 1^ 

' vs Val Glu Asr Ala Val -rg Giu His Pro Asp Tyr 

150 160 



ICC GTG GTG TTT A33 GGT CA. 

vs" V'pi =h* Thr Glv His Se: 



GGT GGT GCG CTA GCA ACT GTT 
Giv Glv Ala Leu Ala Thr Val 
170 * 17: 



432 



430 



523 



jCC GGA GCA GAC CTG CGT - 
-.la Giv A.ia Aso Leu Arc G_y .-.sr. 
30 



GGG TAT G A T ATC GAC GTG TT : TCA 
•^'v Tvr Asr; lie Aso Val Phe Ser 
lis * 190 

AAC CGT GCT TTT GCA GAA. TTC CTG ACC 
&s^. Arc Ala Phe Ala Glu Phe Leu Thr 
200 205 



TAT GGC GCC CCC CGA GTC GG . 
Tvr Giv Ala Pro Arg Vai Giy 
195 

GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT 

Vai Gin Thr Gly Giv Thr Leu Tyr Arc lie :r.r r.is Thr Asn Asp lie 

210 ' 215 220 

GTC CCT AGA CTC CCG CCT CGA GAA TTC GGT TAC AGC CAT TCT AGC CCA 

Val Pro Arc Leu Pro Pro Arg Giu Phe Giy Tyr Ser His Ser Ser Pro 

225 * 230 23d 240 

GAG TAC TGG ATC AAA TCT GGA ACA CTA GTC CCC GTC ACC CGA AAC GAT 

Glu -v- T-d U» Lvs Ser Giy Thr Leu Val Pro Val Thr Arg Asn Asp 

245 250 255 

ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT 

Ti e Val Lvs lie Glu Glv lie Aso Aia Thr Gly Gly Asn Asn Gin Pro 

* 260 " ' 265 270 



576 



52^ 



672 



720 



768 



816 






AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG 8 64 

Asn lie Pro Asd lie Pro Ala Kis Leu Trp Tyr Phe Gly Leu lie Gly 
275 * 280 285 

ACA TGT CTT TAG 876 
Thr Cys Leu * 
290 
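
The pairing of codons and amino acids in the listing above can be checked with the standard genetic code. The sketch below builds a codon table from the conventional T, C, A, G ordering and translates the final codons of SEQ ID NO: 12; the function name is illustrative, and one-letter codes are used (T = Thr, C = Cys, L = Leu, * = stop).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (spaces ignored) codon by codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3   # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Final line of SEQ ID NO: 12 as printed in the listing.
print(translate("ACA TGT CTT TAG"))

# An 876-base coding sequence corresponds to this many codons,
# the last of which is the stop codon.
print(876 // 3)
```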

(2) INFORMATION FOR SZQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 292 amino acids 
(3) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: orotein 

(xi) SEQUENCE DESCRIPTION: SZQ ID MO : 13: 

Met Arc Ser Ser Leu Val Leu Phe Phe Val Ser Ala Tro Thr Ala L*- 
1 5 10 ^ 15 

Ala Ser Pro Tie Arc Arc Glu Val Ser 3 1 r. Aso leu Phe Asn Gin 
2Z ' ' 25 30 

A s r. Leu Phe Ala Gin Tvr Ser Ala Ala Ala T"r Cvs Glv Lvs Asn As - 

2= 4: 45 

Asp Ala ?:; Ala Glv 7r. r Asn. He Tr.r Cvs Tr.r Glv Asn Ala Cvs Pre 
50 Er 50 

Glu Val Glu lys Ala Asp Ala Tr.r ?ne Leu Tyr Ser Phe Glu Asp Ser 

Gly Val Glv Aso Val Tr.r Glv Pne leu Ala Leu Aso Asn Thr Asn Lvs 
85 90 * 95 

Leu He Val Leu Ser ?r.e Ar:: Glv Ser Arc, Ser He Glu Asn Trp Ii<=> 
10: ' ' 105 U0 

Gly Asn Leu Asn Phe Aso leu lvs Glu He Asn Aso lie Cvs Ser Glv 
1:5 120 * . 125 

Cys Arg Glv His As- Glv Phe Tr.r Ser Ser Tro Arc Ser Val Ala Aso 
130 * * 135 140 

Thr Leu Arc Gin Lvs Val Glu Aso Ala Val Arc Glu His Pro Asd Tyr 
145 * 150 155 ' 160 

Arc Val Val Phe Thr Gly His Ser Leu Glv Glv Ala Leu Ala Thr Val 
155 170 175 

Ala Gly Ala Asd Leu Arc Glv Asn Glv Tvr Aso He Asd Val Phe Ser 
180 185 * 190 

Tyr Gly Ala Pro Arc Val Glv Asn Arc Ala Phe Ala Glu Phe Leu Thr 
195 ' 200 205 

Val Gin Thr Gly Glv Thr Leu Tyr Arg lie Thr His Thr Asn Asp He 
210 215 220 

Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro 
225 230 235 240 

Glu Tyr Tro He Lvs Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp 
245 250 255 
Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
260 265 270

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
275 280 285

Thr Cys Leu
290

(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 
15 (A) LENGTH: 864 base pairs 

(3) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: ON A (genomic) 






( v 1 ) ORIGINAL SOURCE : 

•5) STRAIN: Pseuccrr.cnas so. 



2 5 (ixj FEATURE: 

[A; NAME/KEY: ~a-_pep:ice 
(B) LOCATION : 1 . . 554 

I FEATURE : 

30 ; A; NA>'E / KEY : GTS 

;B) LOCATION: 1 . . 554 

ixi) SEQUENCE DESCRIPTION: SEQ 10 NC : 14: 

35 — ~ T -- — -i- A" AA3 AGE CAG TAT CCG ATC GTC CTG ACC 43 

Phe Gly s:: S^r ^ T^r Tnr Lys Thr Gin Tyr Pro lie Val Leu Thr 

_ — r -~ — - " 3: CTG CTT GCA GTC GAG TAG TGG TAG 9c 

" Kis cly ^ Leu Gly ? r.e aV p ser Leu Leu Giy Vai As? Tyr Trp Tyr 

_ .„ , _ c ^ ~ — GAC GGC GEC ACC GTC TAG GTC ACC 144 

.5 Giv ?ie Pre Se"r aH Le^ Arc Lys Asp Giy Ala Thr Val Tyr Val Thr 

35 ^° 4:) 

_ r^r — C ~CC GAA GCC CGA GGT GAG CAA. CTG CTG 

G~ Val Ser Leu Asp Vh^ Ser Glu Ala Arg Gly Glu «„ Leu Leu 
50 50 55 60 

Arc CAA GTC GAG GAA ATC GTG GCC ATC AGC GGC AAG CCC AAG GTC AAC 
Thr Gin Val Glu Glu lie Val Ala He Ser Giy Lys Pro Lys Val Asn 
65 ™ a0 

" CTG TTC GGC CAC AGC CAT GGC GGC- CCT ACC ATC CGC TAC GTT GCC GCC 288 
Leu Phe Giy His Ser Kis Giy Giy Pro Tnr lie Arg Tyr Val «la Ala 

90 



e5 



GTG CGC CCG GAT C~G GTC GCC TGG GTC ACC AGC ATT GGC GCG CCG CAC 
Va? So Pro Aso Leu Val Ala Ser Vai Thr Ser lie Giy Ala Pro His 
100 110 

AAG GGT TCG GCC ACC GCC GAC TTC ATC CGC CAG GTG CCG GAA GGA TCG 
Lvs Giv Ser Ala Thr Ala Asp Phe lie Arg Gin Vai Pro Glu Gly Ser 
* ' " H5 120 l 2 ^ 



192 



240 



336 



384 






GCC AGC GAA GCG ATT CTG GCC GGG ATC GTC AAT GGT CTG GGT GCG CTG 4 32 

Ala Ser Glu Ala He Leu Ala Gly He Val Asn Gly Leu Gly Ala Leu 
130 135 140 

ATC AAC TTC CTT TCC GGC AGC AG" TCG GAC ACC CCA CAG AAC TCG CTG 4 80 

He Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 
145 150 155 160 

GGC ACG CTG GAG TCA CTG AAC TCC GAA GGC GCC GCA CGG TTT AAC GCC 528 
Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe Asn Ala 
165 170 175 

CGC TTC CCC CAG GGG GTA CCA ACC AGC GCC TGC GGC GAG__G.GC_GAT TAC 57 6 

Arg Phe Pro Gin Gly Vai Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr 
180 185 190 

GTG GTC AAT GGC GTG CGC TAT TAC TCC TGG AGG GGC ACC AGC CCG CTG 624 
Val Val Asn Gly Val Arg Tyr Tyr Ser Trp.Arg Gly Thr Ser Pro Leu 
195 200 * 205 



ACC AAC GTA CTC GAC CCC TCC GAC CTG CTG CTC GGC GCC ACC TCC CTG 67 2 

-.sn Val Leu Aso Pro Ser 
:10 * 215 



Thr Asn Val Leu Aso Pro Ser Aso Leu Leu Leu Glv Aia Tr.r Ser Leu 

220 



ACC TTC GGT TTC GAG GCC AAC GAT GGT CTC TTT GGA CGC TGG AGC TCG 720 
Thr Phe Gly Phe Glu Ala A sr. Aso Glv Leu Val Gly Arc Cys Ser Ser 
225 22j 125 24: 

CGG GTG GGT ATG GTG ATC CGC GAC AAC TAC C^G ATG AAC GAC GTG GAC 7 5 S 

Arc Leu Glv Me: Val He Arc Aso Asr. Tyr Ar; :• ■ e : Asn His Leu Aso 
2^5 250 255 

GAG GTG AAC CAG ACC TTC GGG CTG ACC AGC ATC TTC GAG' ACC AGC CCG S16 
Glu Val Asn Gin Thr ?he Glv Leu Thr Ser He Phe Glu Thr Ser Pro 
250 * 265 270 

GTA TCG GTC TAT CGC CAG CAA. GCC AAT C o C ~ - G nA.G A.-,C GCC GGG CTC 8 64 

Val Ser Val Tvr Arc Gin Gin Ala Asn Arc Leu Lvs As?. Ala Glv Leu 
275 ' ' 2S0 2S5 

;2; ::;ro?j'iATiGN ro?. sc: ID NO: 1£: 

(i) sequence CHARACTERISTICS: 

(A) LENGTH: 238 amino acids 

(3) TYPE : amino acid 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Phe Glv Ser Ser Asn Tyr Thr Lys Thr Gin 'Tvr Pro lie Vai Leu Thr 
15 10 15 

His Gly Met Leu Gly Phe Aso Ser Leu Leu Gly Val Aso Tvr Trp Tyr 
20 25 30 

Glv lie Pro Ser Ala Leu Ara Lvs Aso Glv Ala Thr Val Tyr Val Thr 
35 ' 40 45 

Glu Val Ser Gin Leu Aso Thr Ser Glu Ala Arc Gly Glu Gin Leu Leu 
50 * 55 60 

Thr Gin Val Glu Glu He Val Ala lie Ser Gly Lys Pro Lys Val Asn 
65 70 75 80 

Leu Phe Gly His Ser His Gly Giy Pro Thr He Arg Tyr Val Ala Ala 
85 90 95 



Val Arg Pro Asd Leu Val Ala Ser Val Thr Ser :ie Gly Ala Pro His 



100 



105 



110 



Lys Gly Ser Ala Thr Ala Aso Phe lie Arg Gin Val Pro Glu Gly Ser 

115 * 120 125 

Ala Ser Glu Ala lie Leu Ala Giv lie Val Asn Gly Leu Gly Ala Leu 

130 135 140 

lie Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 

145 150 155 160 

Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Ptte'-Asn Ala 



165 



170 



175 



Arg Phe Pro Gin Glv Val Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr 



180 



135 



190 



Val Vai Asn Glv Vai Arg Tvr Tvr Ser Trp Arg Gly Thr Ser Pro Leu 
195 " " 200 205 



rhr Asr. Val Leu Asr Pro Ser Aso Le 
210 " 21: 



s- Leu Sly Ala Thr 
"7 2 j 

Asr. Asp Sly Leu Val Sly Arg Cys 



5 e r • « ■ 



Aso .-.s: 



30 ' 245 

Glu Val Asn Gin Thr ?~e 
260 

35 Vai Ser Vai Tyr Ar; Sir. 
2 7 5 



Asn -is Leu -.so 
255 



rr.e 01 



r-ro 



2 7 0' 



.e; Lvs Asn Ala Glv Leu 
235 
PATENT CLAIMS

1. A method for preparing polypeptide variants by shuffling
different nucleotide sequences of homologous DNA sequences by in
vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

2. The method according to claim 1, wherein more than one cycle of
step a) to f) is performed.

3. The method according to claims 1 and 2, wherein two or more
opened plasmids are shuffled with one or more homologous DNA
fragments in the same shuffling cycle.

4. The method according to any of claims 1 to 3, wherein the opened
plasmid(s) is(are) gapped.

5. The method according to any of claims 1 to 4, wherein the ratio
between the opened plasmid(s) and homologous DNA fragment(s) is in
the range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol
vector:mol fragments), with the specific concentrations being from 1
pM to 10 M of the DNA.

6. The method according to any of claims 1 to 5, wherein 2 or more,
preferably from 2 to 6, especially 2 to 4 of the DNA fragments have
partially overlapping regions.

7. The method according to claim 6, wherein the overlapping regions
of the DNA fragments lie in the range from 5 to 5000 bp, preferably
from 10 bp to 500 bp, especially 10 bp to 100 bp.

8. The method according to any of claims 1 to 7, wherein at least
one cycle of step a) to f) is backcrossing with the initially used
DNA fragments.

9. The method according to any of claims 1 to 8, wherein the
plasmid(s) is(are) opened in the region around the middle of the DNA
sequence(s) encoding the polypeptide(s).

10. The method according to any of claims 1 to 9, wherein the
plasmid(s) is(are) opened close to a mutation in the DNA sequence(s)
encoding the polypeptide(s).

11. The method according to any of claims 1 to 10, wherein the DNA
fragment(s) prepared in step c) is(are) prepared under conditions
suitable for high, medium or low mutagenesis.

12. The method according to any of claims 1 to 11, wherein the
polypeptides producible from the input DNA sequences are enzymes or
proteins with biological activity.

13. The method according to claim 12, wherein the polypeptides are
enzymes selected from the group including proteases, lipases,
... amylases, peroxidases, oxidases and phytases.

14. The method according to claim 12, wherein the polypeptides are
proteins with biological activity selected from the group including
insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin,
parathyroid hormone, pigmentary hormones, somatomedin,
erythropoietin, luteinizing hormone, chorionic gonadotropin,
hypothalamic releasing factors, antidiuretic hormones, thyroid
stimulating hormone, relaxin, interferon, thrombopoietin (TPO) and
prolactin.

15. The method according to any of claims 1 to 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from filamentous fungi, such as Humicola
sp., in particular Humicola lanuginosa, especially Humicola
lanuginosa DSM 4109.

16. The method according to claim 15, wherein at least one of the 
input DNA sequences is selected from the group of vectors (a) to (f) 
and/or DNA fragments (g) to (aa) coding for Humicola lanuginosa 
lipase variants. 

17. The method according to any of claims 1 to 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from filamentous fungi of the genera
Absidia, Rhizopus, Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista.

18. The method according to any of claims 1 to 12, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from bacteria, such as Pseudomonas sp.,
in particular Ps. fragi, Ps. stutzeri, Ps. cepacia, Ps. fluorescens,
Ps. plantarii, Ps. gladioli, Ps. alcaligenes, Ps. pseudoalcaligenes,
Ps. mendocina, Ps. aeruginosa, Ps. glumae, Ps. syringae, Ps.
wisconsinensis, or a strain of Bacillus sp., in particular B.
subtilis, B. stearothermophilus or B. pumilus, or a strain of
Streptomyces sp., in particular S. scabies, or a strain of
Chromobacterium sp., in particular C. viscosum.






19. The method according to any of claims 1 to 12, wherein at least
one of the initially used input DNA sequences is a variant DNA
sequence, such as a DNA sequence coding for a variant enzyme, in
particular lipase variants, derived from yeasts, such as Candida
sp., in particular Candida rugosa, or Geotrichum sp., in particular
Geotrichum candidum.

20. The method according to any of claims 1 to 19, wherein the 
homologous input DNA sequences are at least 60%, preferably at least 
70%, better more than 80%, especially more than 90%, and even up to
100% homologous. 

21. The method according to any of claims 1 to 20, wherein the
recombination host cell is a eukaryotic cell, such as a fungal cell
or a plant cell.

22. The method according to claim 21, wherein said fungal cell is a
yeast cell from the group of cells of Saccharomyces sp., in
particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp., in particular
Schizosaccharomyces pombe, or Kluyveromyces sp., such as K. lactis,
or Hansenula sp., in particular H. polymorpha, or Pichia sp., in
particular P. pastoris, or a filamentous fungus from the group of
Aspergillus sp., in particular A. niger, A. nidulans or A. oryzae,
or Neurospora sp., or Fusarium sp., in particular F. oxysporum, or
Trichoderma sp.
ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATT
5447 ---------+---------+---------+---------+---------+---------+ 5506

1 MRSSLVLFFVSAWTALASPI

CGTCGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCTGCA
5507 ---------+---------+---------+---------+---------+---------+ 5566

21 RREVSQDLFNQFNLFAQYSA

GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCTGGTACAAACATTACGTGCACGGGA
5567 ---------+---------+---------+---------+---------+---------+ 5626

41 AAYCGKNNDAPAGTNITCTG

AATGCCTGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCGTTTGAAGACTCT
5627 ---------+---------+---------+---------+---------+---------+ 5686

61 NACPEVEKADATFLYSFEDS

GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAATTGATCGTCCTC
5687 ---------+---------+---------+---------+---------+---------+ 5746

81 GVGDVTGFLALDNTNKLIVL

[Positions 5747 to 6286 (amino acid residues 101 to 280) are not
legible in the source.]

CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG
6287 ---------+---------+---------+------ 6322

281 LWYFGLIGTCL*
ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATA
5447 ---------+---------+---------+---------+---------+---------+ 5506

1 MRSSLVLFFVSAWTALASPI

SnaBI PvuII
CGTAGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCAGCT
5507 ---------+---------+---------+---------+---------+---------+ 5566

21 RREVSQDLFNQFNLFAQYSA

GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCAGGTACAAACATTACGTGCACGGGA
5567 ---------+---------+---------+---------+---------+---------+ 5626

41 AAYCGKNNDAPAGTNITCTG

SphI
AATGCATGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCGTTTGAAGACTCT
5627 ---------+---------+---------+---------+---------+---------+ 5686

61 NACPEVEKADATFLYSFEDS

[Positions 5687 to 6322 (amino acid residues 81 to 291) are not
legible in the source; the restriction-site labels that remain
readable in this span are HindIII, BglII, BstXI, NheI, BstEII and
SpeI.]
PCR fragment from Blue 427

Blue 442